Friday, 28 January

Thursday, 27 January

21:42

GNU poke 2.0 released [Planet GNU]

I am happy to announce a new major release of GNU poke, version 2.0.

This release is the result of a full year of development.  A lot of things have changed and improved with respect to the 1.x series; we have fixed many bugs and added quite a lot of new exciting and useful features.

See the complete release notes at https://jemarch.net/poke-2.0-relnotes.html for a detailed description of what is new in this release.

We have had lots of fun and learned quite a lot in the process; we really hope you will have at least half as much fun using this tool!

The tarball poke-2.0.tar.gz is now available at https://ftp.gnu.org/gnu/poke/poke-2.0.tar.gz.

> GNU poke (http://www.jemarch.net/poke) is an interactive, extensible editor for binary data.  Not limited to editing basic entities such as bits and bytes, it provides a full-fledged procedural, interactive programming language designed to describe data structures and to operate on them.


Thanks to the people who contributed code and/or documentation to this release.  In no particular order, they are:

  • Mohammad-Reza Nabipoor
  • Luca Saiu
  • Bruno Haible
  • Egeyar Bagcioglu
  • David Faust
  • Guillermo E. Martinez
  • Konstantinos Chasialis
  • Matt Ihlenfield
  • Thomas Weißschuh
  • Sergei Trofimovich
  • Fangrui Song
  • Indu Bhagat
  • Jordan Yelloz
  • Morten Linderud
  • Sergio Durigan Junior

As always, thank you all!

Happy poking!

--
Jose E. Marchesi
Frankfurt am Main
28 January 2022

20:28

Quantum of Nightmares: spoiler time! [Charlie's Diary]

Quantum of Nightmares: UK cover

In the before times, a mass market paperback edition usually followed the initial hardcover release of one of my books exactly 12 months later.

But we're not living in the before times any more! The UK paperback of "Quantum of Nightmares" is due in November, but there isn't going to be a US paperback (although the ebook list price will almost certainly drop to reflect a paperback-equivalent price).

So ... I'm open for questions about Quantum of Nightmares in the comment thread below. Ask me anything! Just ignore this thread if you haven't read the book yet and mean to do so in the near future, because there will be spoilers.

18:00

Link [Scripting News]

Dare Obasanjo: "Web3 is great because it’s basically saying let’s take all the lessons we’ve learned from building networked software for 50+ years and throw them all out the window." I agree, but it might work because the current development environment for networked apps is like a hairball on a hairball running on a hairball — incomprehensible, unless you’re already up the learning curve. This is why we always reinvent everything a few times every generation.

12:14

CodeSOD: Three Links [The Daily WTF]

Brian's hired a contractor to tackle a challenging technical problem: they wanted to display three links in three table cells. Now, you or I might just write that as HTML. But if we did that,...

06:28

1578 [Looking For Group]

The post 1578 appeared first on Looking For Group.

Wednesday, 26 January

22:28

Link [Scripting News]

Highly recommended: Eight Days a Week.

18:00

Top Comments – Pages 1575 – 1576 [Looking For Group]

Tuesday, YOU are the star! We curate our favourites from the previous week’s comments on lfg.co and Facebook and remind you how clever you are. Here are your top comments for Looking For Group pages 1575 – 1576. Looking For […]

The post Top Comments – Pages 1575 – 1576 appeared first on Looking For Group.

16:28

The Rehabilitation of Pepe Le Pew [Scenes From A Multiverse]

Howdy! Here’s a comic about a movie that came out last month. Very topical.

Time Flies [Scenes From A Multiverse]

Enjoy this installment of Time Flies! Hope you’re having fun.

The Great Awakening [Scenes From A Multiverse]

The Mysterious Q has been defeated by the forces of facts and failed predictions. Huzzah! Everything will be great and perfect from now on, guaranteed.

Tweeters Never Prosper [Scenes From A Multiverse]

Sometimes, when the world is upside down and you’re surrounded by coups and gamma ray bursts and strange time anomalies, it’s good to focus on the things you can control.

Re: Solutions [Scenes From A Multiverse]

Happy new year, everyone. Let’s try not to fuck this one up too badly.

Direct Messaging [Scenes From A Multiverse]

Not only should we defund the police but we should take their underwear too. They haven’t earned the right to underwear.

Peaceful Transfer [Scenes From A Multiverse]

If you dislike elections, take heart: this one may be the last.

Good luck to us all.

The Nominee [Scenes From A Multiverse]

Hey everyone! I know we all feel like everything is doomed now, but remember: we probably are.

Before then, there’s new pins and books in our store! Please go buy them.

The Adjudication Squad vs. Cruel Multiverse Syndrome [Scenes From A Multiverse]

Things are not looking so great.

But! We do have brand new enamel bunnies pins in the store. So that’s something.

We also have SFAM Book 3: Greetings From Bunnies Planet available to purchase now!

Full Disclosure [Scenes From A Multiverse]

I’ve got a bad feeling about this

Getting Size and File Count of a 25 Million Object S3 Bucket [The Open Source Grid Engine Blog]

Amazon S3 is a highly durable storage service offered by AWS. With eleven 9s (99.999999999%) of durability, high bandwidth to EC2 instances, and low cost, it is a popular storage location for Grid Engine job input and output files. (As a side note, we were at Werner Vogels's AWS Summit 2013 NYC keynote where he disclosed that S3 stores 2 trillion objects and handles 1.1 million requests per second.)

AWS Summit 2013 keynote - Werner Vogels announces over 2 trillion objects stored in S3

On the other hand, S3 presents its own set of challenges. For example, unlike a POSIX filesystem, there are no directories in S3 - S3 buckets have a flat namespace, and object keys (AWS refers to files as "objects") can contain delimiters that are used to create pseudo-directory structures.

Further, there is no API that returns the size of an S3 bucket or the total number of objects. The only way to find the bucket size is to iteratively perform LIST API calls, each of which returns information on up to 1,000 objects. The boto library offers a higher-level function that abstracts the iteration for us, so all it takes is a few lines of Python:
from boto.s3.connection import S3Connection

# Open the bucket (credentials are picked up from the boto configuration)
s3bucket = S3Connection().get_bucket(<name of bucket>)

# Sum the size of every object returned by the paginated LIST calls
size = 0
for key in s3bucket.list():
    size += key.size

print "%.3f GB" % (size * 1.0 / 1024 / 1024 / 1024)

However, when the above code is run against an S3 bucket with 25 million objects, it takes 2 hours to finish. A closer look at the boto network traffic confirms that the high-level list() function is doing all the heavy lifting of calling the lower-level S3 LIST (i.e. over 25,000 LIST operations for the bucket). Such a chatty protocol explains why it takes 2 hours.

With a quick Google search we found that most people work around it by either:
  • storing the size of each object in a database, or
  • extracting the size information from the AWS billing console
For example, with the database approach:
mysql> SELECT SUM(size) FROM s3objects;
+------------+
| SUM(size)  |
+------------+
| 8823199678 |
+------------+
1 row in set (20 min 19.83 sec)
Note that neither workaround is ideal. While it is much quicker to perform a DB query, we are usually not the creator of the S3 bucket (recall that the bucket, which is owned by another AWS account, stores job input files). Also, in many cases we need other information, such as the total number of files in the S3 bucket, so the total size shown in the billing console doesn't give us the complete picture. And lastly, we need the information about the bucket now, and can't wait until the next day to start the Grid Engine cluster and run jobs.

Running S3 LIST calls in Parallel
Actually, the S3 LIST API takes a prefix parameter, which "limits the response to keys that begin with the specified prefix". Thus, we can run concurrent LIST calls, where one call handles the first half of the keyspace and the other call handles the second half. Then it is just a matter of reducing and/or merging the numbers!
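As an illustration, here is a minimal sketch of that idea using boto and a Python multiprocessing worker pool (the same approach we describe below); the bucket name and the single-character prefix alphabet are placeholder assumptions for the example:

from multiprocessing import Pool
from boto.s3.connection import S3Connection

BUCKET = 'my-input-bucket'  # hypothetical bucket name

def list_prefix(prefix):
    # Count objects and bytes for one key prefix via paginated S3 LIST calls.
    bucket = S3Connection().get_bucket(BUCKET)
    count = size = 0
    for key in bucket.list(prefix=prefix):
        count += 1
        size += key.size
    return count, size

if __name__ == '__main__':
    prefixes = list('0123456789abcdefghijklmnopqrstuvwxyz')
    pool = Pool(processes=4)                 # number of parallel workers
    results = pool.map(list_prefix, prefixes)
    total_count = sum(c for c, s in results)
    total_size = sum(s for c, s in results)
    print "%d objects, %.3f GB" % (total_count, total_size / 1024.0 / 1024 / 1024)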

With 2 workers (we use the Python multiprocessing worker pool) running LIST in parallel, we reduced the run time to 59 minutes, and further to 34 minutes with 4 workers. We then maxed out the 2-core c1.medium instance as both processors were 100% busy. At that point we migrated to a c1.xlarge instance that has 8 cores. With 4 times more CPU cores and higher network bandwidth, we started with 16 workers, which took 11 minutes to finish. Then we upped the number of workers to 24, and it took 9 minutes and 30 seconds!

So what if we use even more workers? While our initial goal was to finish the pre-processing of the input bucket in less than 15 minutes, we believe we can extract more parallelism from S3, as S3 is not a single server but many, designed to scale with a large number of operations.

Taking Advantage of S3 Key Hashing
So as a final test, we picked a 32-vCPU instance and ran 64 workers. It only took 3 minutes and 37 seconds to finish. Then we noticed that the way we distribute work to the workers doesn't take advantage of S3 key hashing!

As the 25-million-object bucket has keys that start with UUID-like names, the first character of the prefix can be 0-9 or a-z (36 characters in total). So key prefixes were generated like:
00, 01, ..., 09, 0a, ..., 0z
10, 11, ..., 19, 1a, ..., 1z

With that ordering, the workers running in parallel at any point in time all share the same leading character, so they have a higher chance of hitting the same S3 partition. We performed a loop interchange, so the prefixes become:
00, 10, ..., 90, a0, ..., z0
01, 11, ..., 91, a1, ..., z1
Now it takes 2 minutes and 55 seconds.
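The loop interchange itself is trivial; here is a sketch of the two prefix orderings, using the character set of the bucket's UUID-like keys:

chars = '0123456789abcdefghijklmnopqrstuvwxyz'

# Original order: prefixes sharing a leading character are adjacent,
# so concurrent workers tend to hit the same S3 partition.
original = [a + b for a in chars for b in chars]      # 00, 01, ..., 0z, 10, ...

# After the loop interchange: consecutive prefixes differ in the leading
# character, spreading concurrent LIST calls across partitions.
interchanged = [a + b for b in chars for a in chars]  # 00, 10, ..., z0, 01, ...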


Enhanced Networking in the AWS Cloud - Part 2 [The Open Source Grid Engine Blog]

We looked at the AWS Enhanced Networking performance in the previous blog entry, and this week we just finished benchmarking the remaining instance types in the C3 family. C3 instances are extremely popular as they offer the best price-performance for many big data and HPC workloads.

Placement Groups
An additional detail we didn't mention before: we booted all SUT (System Under Test) pairs in their own AWS Placement Groups. By using Placement Groups, instances get full-bisection bandwidth and lower, more predictable network latency for node-to-node communication.

Bandwidth
With c3.8xlarge instances, which have 10Gbit Ethernet, Enhanced Networking offers 44% higher network throughput. The smaller C3 instance types, which have lower network throughput capability, also see better throughput with Enhanced Networking, but the difference is not as big.



Round-trip Latency
The c3.4xlarge and c3.8xlarge have network latency similar to that of the c3.2xlarge; for these larger instance types it is between 92 and 100 microseconds.



Conclusion
All C3 instance types benefit from Enhanced Networking, which in many cases cuts the latency in half at no additional cost.

On the other hand, without Enhanced Networking, bandwidth-sensitive applications running on c3.8xlarge instances won't be able to fully take advantage of the 10Gbit Ethernet when there is only one thread handling network traffic -- a common problem decomposition method we have seen in our users' code: MPI for inter-node communication, and OpenMP or even Pthreads for intra-node communication. For this type of hybrid HPC code there is only one MPI task handling network communication. Enhanced Networking delivers over 95% of the 10Gbit Ethernet bandwidth to such code, but without Enhanced Networking the MPI task only gets 68% of the available network bandwidth.

Enhanced Networking in the AWS Cloud [The Open Source Grid Engine Blog]

At re:Invent 2013, Amazon announced the C3 and I2 instance families that have the higher-performance Xeon Ivy Bridge processors and SSD ephemeral drives, together with support of the new Enhanced Networking feature.

Enhanced Networking - SR-IOV in EC2
Traditionally, EC2 instances send network traffic through the Xen hypervisor. With SR-IOV (Single Root I/O Virtualization) support in the C3 and I2 families, each physical ethernet NIC virtualizes itself as multiple independent PCIe Ethernet NICs, each of which can be assigned to a Xen guest.

Thus, an EC2 instance running on hardware that supports Enhanced Networking can  "own" one of the virtualized network interfaces, which means it can send and receive network traffic without invoking the Xen hypervisor.

Enabling Enhanced Networking is as simple as:

  • Create a VPC and subnet
  • Pick an HVM AMI with the Intel ixgbevf Virtual Function driver
  • Launch a C3 or I2 instance using the HVM AMI

Benchmarking
We use the Amazon Linux AMI, as it already has the ixgbevf driver installed, and Amazon Linux is available in all regions. We use netperf to benchmark C3 instances running in a VPC (ie. Enhanced Networking enabled) against non-VPC (ie. Enhanced Networking disabled).


Bandwidth
Enhanced Networking offers up to a 7.3% gain in throughput. Note that with or without Enhanced Networking, both c3.xlarge and c3.2xlarge almost reach 1 Gbps (which we believe is the hard limit set by Amazon for those instance types).





Round-trip Latency
Many message-passing MPI and HPC applications are latency sensitive. Here Enhanced Networking support really shines, with a maximum speedup of 2.37x over the normal EC2 networking stack.




Conclusion 1
Amazon says that both the c3.large and c3.xlarge instances have "Moderate" network performance, but we found that c3.large peaks at around 415 Mbps, while c3.xlarge almost reaches 1Gbps. We believe the extra bandwidth headroom is for EBS traffic, as c3.xlarge can be configured as "EBS-optimized" while c3.large cannot.

Conclusion 2 
Notice that c3.2xlarge with Enhanced Networking enabled has a round-trip latency of 92 microseconds, which is much higher than that of the smaller instance types in the C3 family. We repeated the test in both the us-east-1 and us-west-2 regions and got identical results.

Currently AWS has a shortage of C3 instances -- all c3.4xlarge and c3.8xlarge launch requests we have issued so far resulted in "Insufficient capacity". We are closely monitoring the situation, and we plan to benchmark the c3.4xlarge and c3.8xlarge instance types to see if we can reproduce the increased latency issue.

Updated Jan 8, 2014: We have published Enhanced Networking in the AWS Cloud (Part 2) that includes the benchmark results for the remaining C3 types.

Running a 10,000-node Grid Engine Cluster in Amazon EC2 [The Open Source Grid Engine Blog]

Recently, we have provisioned a 10,000-node Grid Engine cluster in Amazon EC2 to test the scalability of Grid Engine. As the official maintainer of open-source Grid Engine, we have the obligation to make sure that Grid Engine continues to scale in the modern datacenters.

Grid Engine Scalability - From 1,000 to 10,000 Nodes
From time to time, we receive questions related to Grid Engine scalability, which is not surprising given that modern HPC clusters and compute farms continue to grow in size. Back in 2005, Hess Corp. was  having issues when its Grid Engine cluster exceeded 1,000 nodes. We quickly fixed the limitation in the low-level Grid Engine communication library and contributed the code back to the open source code base. In fact, our code continues to live in every fork of Sun Grid Engine (including Oracle Grid Engine and other commercial versions from smaller players) today.

So Grid Engine can handle thousands of nodes, but can it handle tens of thousands of nodes? Also, how would Grid Engine perform when there are over 10,000 nodes in a single cluster? Previously, besides simulating virtual hosts, we could only take the reactive approach - i.e. customers report issues, and we collect the error logs remotely, and then try to provide workarounds or fixes. In the Cloud age, shouldn't we take a proactive approach as hardware resources are more accessible?

Running Grid Engine in the Cloud
In 2008, my former coworker published a paper about benchmarking Amazon EC2 for HPC workloads (many people have quoted the paper, including me on the Beowulf Cluster mailing list back in 2009), so running Grid Engine in the Cloud is not something unimaginable.

Since we don't want to maintain our own servers, using the Cloud for regression testing makes sense. In 2011, after joining the MIT StarCluster project, we started using MIT StarCluster to help us test new releases, and indeed the Grid Engine 2011.11 release was the first one tested solely in the Cloud. It makes sense to run scalability tests in the Cloud, as we also don't want to maintain a 10,000-node cluster that sits idle most of the time!

Requesting 10,000 nodes in the Cloud
Most of us just request resources in the Cloud and never need to worry about resource contention. But there are default soft limits in EC2, set by Amazon to catch runaway resource requests. Specifically, each account can only have 20 on-demand and 100 spot instances per region. As running 10,000 nodes (instances) exceeds the soft limit many times over, we asked Amazon to increase the upper limit of our account. Luckily the process was painless! (And thinking about it, the limit actually makes sense for new EC2 customers - imagine what kind of financial damage an infinite loop of instance requests could do. In the extreme case, what kind of damage could a hacker with a stolen credit card do by launching a 10,000-node botnet!)

Launching a 10,000-node Grid Engine Cluster
With the limit increased, we first started a master node on a c1.xlarge (High-CPU Extra Large Instance, a machine with 8 virtual cores and 7GB of memory). We then started adding instances (nodes) in chunks - we picked chunk sizes that were always less than 1,000, as we have the ability to change the instance type and the bid price with each request chunk. Also, we mainly requested spot instances (a request sketch follows the list below) because:

  • Spot instances are usually cheaper than on-demand instances. (As I am writing this blog, a c1.xlarge spot instance is only costing 10.6% of the on-demand price.)
  • It is easier for Amazon to handle our 10,000-node Grid Engine cluster, as Amazon can terminate some of our spot instances if there is a spike in demand for on-demand instances.
  • We can also test the behavior of Grid Engine when some of the nodes go down. With traditional HPC clusters, we needed to kill some of the nodes manually to simulate hardware failure, but spot termination does this for us. We are in fact taking advantage of spot termination!
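For illustration only, here is a minimal sketch of one such chunked spot request using boto; the bid price, count, and key pair name are hypothetical placeholders rather than the values we actually used, and the AMI ID is the StarCluster AMI mentioned later in this post:

import boto.ec2

conn = boto.ec2.connect_to_region('us-east-1')
requests = conn.request_spot_instances(
    price='0.06',               # hypothetical bid, in USD per instance-hour
    image_id='ami-899d49e0',    # StarCluster EBS AMI in us-east-1
    count=500,                  # one chunk; we always stayed below 1,000 per request
    instance_type='c1.xlarge',
    key_name='gridengine-key')  # hypothetical key pair name
print "submitted %d spot requests" % len(requests)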

All went well until we had around 2,000 instances in a region, when we found that further spot requests all got fulfilled and then the instances were terminated almost instantly. A quick look at the error logs and the AWS Management Console showed that we had exceeded the EBS volume limit, which is 5,000 volumes or a total size of 20 TB. We were using the StarCluster EBS AMI (e.g. ami-899d49e0 in us-east-1), which has a size of 10GB, so 2,000 of those instances running in a region definitely would exceed the 20TB limit!

Luckily, StarCluster also offers S3 (instance-store) AMIs, although the OS version is older and ships with the older SGE 6.2u5. As all releases of Grid Engine from the Open Grid Scheduler project are wire-compatible with SGE 6.2u5, we quickly launched a few of those S3 AMIs as a test, and not surprisingly the new nodes joined the cluster without any issues. So we continued adding instance-store instances, soon achieving our goal of creating a 10,000-node cluster:

A few more findings:
  • We kept sending spot requests to the us-east-1 region until "capacity-not-able" was returned to us. We were expecting the bid price to go sky-high when an instance type ran out, but that did not happen.
  • When a certain instance type gets mostly used up, further spot requests for that instance type get slower and slower.
  • At peak rate, we were able to provision over 2,000 nodes in less than 30 minutes. In total, we spent less than 6 hours constructing, debugging the issue caused by the EBS volume limit, running a small number of Grid Engine tests, and taking down the cluster.
  • Instance boot time was independent of the instance type: EBS-backed c1.xlarge and t1.micro took roughly the same amount of time to boot.

HPC in the Cloud
To put a 10,000-node cluster into perspective, we can take a look at some of the recent TOP500 entries running x64 processors:
  • SuperMUC, #6 in TOP500, has 9,400 compute nodes
  • Stampede, #7 in TOP500, will have 6,000+ nodes when completed
  • Tianhe-1A, #8 in TOP500 (was #1 till June 2011), has 7,168 nodes
In fact, a quick look at the TOP500 list shows that over 90% of the entries have fewer than 10,000 nodes, so in terms of raw processing power, one can easily rent a supercomputer in the Cloud and get comparable compute power. However, some HPC workloads are very sensitive to network (MPI) latency, and we believe dedicated HPC clusters still have an edge when running those workloads (after all, those clusters are designed to have a low-latency network). It's also worth mentioning that the Cluster Compute instance types with 10GbE can reduce some of the latency.

Future work
With 10,000 nodes, embarrassingly parallel workloads can complete in 1/10 of the time compared to running on a 1,000-node cluster. Also worth mentioning is that the working storage holding the input and output is needed for only 1/10 of the time as well (e.g. instead of using 10 TB-Months, only 1 TB-Month is needed). So not only do you get the results faster, it actually costs less due to reduced storage costs!

With 10,000 nodes and a small number of submitted jobs, the multi-threaded Grid Engine qmaster process constantly uses 1.5 to 2 cores (at most 4 cores at peak) on the master node, and around 500MB of memory. With more tuning or a more powerful instance type, we believe Grid Engine can handle at least 20,000 nodes.



Note: we are already working on enhancements that further reduce the CPU usage!

Summary

Cluster size: 10,000+ slave nodes (master did not run jobs)
SGE versions: Grid Engine 6.2u5 (from Sun Microsystems); GE 2011.11 (from the Open Grid Scheduler Project)
Regions: us-east-1 (ran over 75% of the instances); us-west-1; us-west-2
Instance types: On-demand; Spot
Instance sizes: from c1.xlarge to t1.micro
Operating systems: Ubuntu 10; Ubuntu 11; Oracle Linux 6.3 with the Unbreakable Enterprise Kernel
Other software: Python, Boto

Using Grid Engine in the Xeon Phi Environment [The Open Source Grid Engine Blog]

The Intel Xeon Phi (codename MIC - Many Integrated Core) is an interesting HPC hardware platform. While it is not in production yet, there is already a lot of porting and optimization work being done for Xeon Phi. At the beginning of this year we also started a conversation with Intel - we wanted to make sure that Open Grid Scheduler/Grid Engine works in the Xeon Phi environment, as we had received requests for Xeon Phi support at SC11.

While a lot of information is still under NDA, there are already lots of papers published on the Internet, and even Intel engineers themselves have already disclosed information about the Xeon Phi architecture and programming models. For example, by now, it is widely known that Xeon Phi runs an embedded Linux OS image on the board - and in fact the users can log onto the board and use it as a multi-core Linux machine.

One Xeon Phi execution model is the more traditional offload execution model, where applications running on the host processors offload sections of code to the Xeon Phi accelerator. Note that Intel also defines the Language Extensions for Offload (LEO) compiler directives to ease porting. With the standalone execution model, on the other hand, users execute code directly and natively on the board, and the host processors are not involved in the computation.

Open Grid Scheduler/Grid Engine can easily be configured to handle the offload execution model, as a Xeon Phi used this way is very similar to a GPU device: Grid Engine can schedule jobs to the hosts that have Xeon Phi boards and make sure that the hardware resources are not oversubscribed. The standalone execution model is more interesting - it offers the Linux OS environment that most HPC users are familiar with, but it adds a level of indirection to job execution. We don't have the design finalized yet, as the software environment has not been released to the public, but our plan is to support both execution models in a future version of Open Grid Scheduler/Grid Engine.

Optimizing Grid Engine for AMD Bulldozer Systems [The Open Source Grid Engine Blog]

The AMD Bulldozer series (including Piledriver, which was released recently) is very interesting from a micro-architecture point of view. A Bulldozer processor can have as many as 16 cores, and the cores are grouped into modules. With the current generation, each module contains 2 cores, so an 8-module processor has 16 cores, a 4-module processor has 8 cores, and so on.

  • Traditional SMT (e.g. Intel Hyper-Threading) pretty much duplicates only the register file and the processor front-end; since most of the execution pipeline is shared between the pair of SMT threads, performance can be greatly affected when the SMT threads compete for hardware resources.
  • For Bulldozer, only the Floating-Point Unit, L1 instruction cache, L2 cache, and a small part of the execution pipeline are shared, making resource contention a much smaller concern.
A lot of HPC clusters turn SMT off completely, as performance is the main objective for those installations. Bulldozer processors, on the other hand, are less affected by a dumb OS scheduler, but it still helps if system software understands the architecture of the processor. For example, the patched Windows scheduler that understands the AMD hardware can boost performance by 10%.

And what does this mean for Grid Engine? The Open Grid Scheduler project implemented Grid Engine job binding with the hwloc library (initially for the AMD Magny-Cours Opteron 6100 series - the original PLPA library used by Sun Grid Engine could not handle newer AMD processors), and we also added Linux cpuset support (when the Grid Engine cgroups integration is enabled and the cpuset controller is present). In both cases, the execution daemon essentially becomes the local scheduler that dispatches processes to the execution cores. With a smarter execution daemon (execd), we can speed up job execution with no changes to any application code.

(And yes, this feature will be available in the GE 2011.11 update 1 release.)

Grid Engine Cygwin Port [The Open Source Grid Engine Blog]

Open Grid Scheduler/Grid Engine will support Windows/Cygwin with the GE 2011.11u1 release. We found that many sites just need to submit jobs to a Unix/Linux compute farm from Windows workstations, so in this release only the client side is supported under our commercial Grid Engine support program. For sites that only need to run client-side Grid Engine commands (e.g. qsub, qstat, qacct, qhost, qdel), our Cygwin port fits their needs. We are satisfied with Cygwin until our true native Windows port is ready...


Running daemons under Cygwin is currently a technology preview. We've tested a small cluster of Windows execution machines, and the results look promising:



Note that this is not the first attempt to port Grid Engine to the Cygwin environment. In 2003, Andy mentioned in the Compiling Grid Engine under Cygwin thread that Sun/Gridware had ported the Grid Engine CLI to Cygwin, but daemons were not supported in that original port. Ian Chesal (who worked for Altera at the time) ported the Grid Engine daemons as well, but did not release any code. In 2011 we started from scratch again, and we checked in code changes earlier this year - the majority of the code is already in the GE 2011.11 patch 1 release, with the rest coming in the update 1 release.

So finally, the Cygwin port is in the source tree this time - no more out-of-tree patches.

Giving Away a Cisco Live Full Conference Pass [The Open Source Grid Engine Blog]

Back in May, we attended a local Cisco event here in Toronto. Besides talking to Cisco engineers about their datacenter products and networking technologies, we also met with some technical UCS server people (more on Cisco UCS Blade Servers & Open Grid Scheduler/Grid Engine in a later blog entry).

We also received a Cisco Live Conference Pass, which allows the holder to attend everything at the conference (i.e. the full experience) in San Diego, CA on June 10-14, 2012, and we are planning to give it to the first person who sends us the right answer to the following question:

When run with 20 MPI processes, what will the value of recvbuf[i][i] be for i=0..19 in MPI_COMM_WORLD rank 17 when this application calls MPI_Finalize?


#include <mpi.h>


int sendbuf[100];
int recvbuf[20][100];
MPI_Request reqs[40];


MPI_Request send_it(int dest, int len)
{
   int i;
   MPI_Request req;
   for (i = 0; i < len; ++i) {
       sendbuf[i] = dest;
   }
   MPI_Isend(sendbuf, len, MPI_INT, dest, 0, MPI_COMM_WORLD, &req);
   return req;
}


MPI_Request recv_it(int src, int len)
{
   MPI_Request req;
   MPI_Irecv(recvbuf[src], len, MPI_INT, src, 0, MPI_COMM_WORLD, &req);
   return req;
}


int main(int argc, char *argv[])
{
   int i, j, rank, size;
   MPI_Init(NULL, NULL);
   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
   MPI_Comm_size(MPI_COMM_WORLD, &size);


   /* Bound the number of procs involved, just so we can be lazy and
      use a fixed-length sendbuf/recvbuf. */
   if (rank < 20) {
       for (i = j = 0; i < size; ++i) {
           reqs[j++] = send_it(i, 5);
           reqs[j++] = recv_it(i, 5);
       }
       MPI_Waitall(j, reqs, MPI_STATUSES_IGNORE);
   }


   MPI_Finalize();
   return 0;
}


The code above and the question were written by Mr. Open MPI, Jeff Squyres, who has worked with us since as early as the pre-Oracle Grid Engine days on PLPA, and who suggested that we migrate to the hwloc topology library. (Side note: when Open Grid Scheduler became the maintainer of the open source Grid Engine code base in 2011, Grid Engine multi-core processor binding with hwloc was one of the first major features we added in Open Grid Scheduler/Grid Engine to support discovery of newer system topologies.)

So send us the answer, and the first one who answers the question correctly will get the pass to attend the conference!



Grid Engine cgroups Integration [The Open Source Grid Engine Blog]

The PDC (Portable Data Collector) in Grid Engine's job execution daemon tracks job-process membership for resource usage accounting purposes, for job control purposes (ie. making sure that jobs don't exceed their resource limits), and for signaling purposes (eg. stopping, killing jobs).


Since most operating systems don't have a mechanism to group processes into jobs, Grid Engine adds an additional group ID to each job. As normal processes can't change their GID membership, this is a safe way to tag processes with the job they belong to. On operating systems where the PDC module is enabled, the execution daemon periodically scans all the processes running on the system and groups processes into jobs by looking for the additional GID tag.
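For illustration, here is a minimal sketch of what such a scan looks like on Linux, matching processes to a job by the extra supplementary GID; the GID value is taken from the qrsh example below, and the real execd does this internally rather than by parsing /proc as text:

import glob

def processes_with_gid(job_gid):
    # Return the PIDs whose supplementary groups include job_gid,
    # by scanning the Groups: line of /proc/<pid>/status.
    pids = []
    for status in glob.glob('/proc/[0-9]*/status'):
        try:
            with open(status) as f:
                for line in f:
                    if line.startswith('Groups:'):
                        gids = [int(g) for g in line.split()[1:]]
                        if job_gid in gids:
                            pids.append(int(status.split('/')[2]))
                        break
        except IOError:
            continue  # the process exited while we were scanning
    return pids

print processes_with_gid(20017)  # e.g. the additional GID from the qrsh example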

So far so good, but...
Adding an extra GID has side-effects. We have received reports that applications behave strangely with an unresolvable GID. For example, on Ubuntu, we get:

$ qrsh
groups: cannot find name for group ID 20017

Another problem: it takes time for the PDC to warm up. For some short running jobs, you will find:

removing unreferenced job 64623.394 without job report from ptf

The third problem is that if the PDC runs too often, it takes too much CPU time. In SGE 6.2u5, a memory accounting bug was introduced because the Grid Engine developers needed to reduce the CPU usage of the PDC on Linux by adding a workaround. (Shameless plug: we, the Open Grid Scheduler developers, fixed the bug back in 2010, way ahead of any other Grid Engine implementation that is still active these days.) Imagine running ps -elf every second on your execution nodes - that is how intrusive the PDC is!

The final major issue is that the PDC is not accurate. Grid Engine itself does not trust the information from the PDC at job cleanup. The end result is runaway jobs consuming resources on the execution hosts. Cluster administrators then need to enable a special flag to tell Grid Engine to do proper job cleanup (by default ENABLE_ADDGRP_KILL is off). Quoting the Grid Engine sge_conf manpage:

ENABLE_ADDGRP_KILL
          If this parameter is set then Sun Grid Engine uses  the
          supplementary group ids (see gid_range) to identify all
          processes which are to be  terminated  when  a  job  is
          deleted,  or  when  sge_shepherd(8) cleans up after job
          termination.

Grid Engine cgroups Integration
In Grid Engine 2011.11 update 1, we switch from the additional GID to cgroups as the process-tagging mechanism.

(We, the Open Grid Scheduler / Grid Engine developers, wrote the PDC code for AIX and HP-UX, and the initial PDC code for Mac OS X, which is used as the base for the FreeBSD and NetBSD PDCs. We even wrote a PDC prototype for Linux that does not rely on the GID. Our code was contributed to Sun Microsystems and is used in every implementation of Grid Engine - whether commercial, open source, or commercial open source like Open Grid Scheduler.)

As almost half of the PDCs were developed by us, we know all the issues in the PDC.

We are switching to cgroups now, rather than earlier, because:
  1. Most Linux distributions ship kernels that have cgroups support.
  2. We are seeing more and more cgroups improvements. Lots of cgroups performance issues were fixed in recent Linux kernels.
With the cgroups integration in Grid Engine 2011.11 update 1, all the PDC issues mentioned above are handled. Further, we have bonus features with cgroups:
  1. Accurate memory usage accounting: ie. shared pages are accounted correctly.
  2. Resource limit at the job level, not at the individual process level.
  3. Out of the box SSH integration.
  4. RSS (real memory) limit: we all have jobs that try to use every single byte of memory, but capping their RSS does not hurt their performance. We may as well cap the RSS so that we can take back the spare memory for other jobs.
  5. With the cpuset cgroup controller, Grid Engine can set the processor binding and memory locality reliably. Note that jobs that change their own processor binding are not handled by the original Grid Engine Processor Binding with hwloc (Another shameless plug: we are again the first who switched to hwloc for processor binding) - it is very rare to encounter jobs that change their own processor binding, but if a job or external process decides to change its own processor mask, then this will affect other jobs running on the system.
  6. Finally, with the freezer controller, we can have a safe mechanism for stopping and resuming jobs:
$ qstat
job-ID  prior    name    user      state  submit/start at      queue         slots ja-task-ID
----------------------------------------------------------------------------------------------
    16  0.55500  sleep   sgeadmin  r      05/07/2012 05:44:12  all.q@master      1
$ cat /cgroups/cpu_and_memory/gridengine/Job16.1/freezer.state
THAWED
$ qmod -sj 16
sgeadmin - suspended job 16
$ cat /cgroups/cpu_and_memory/gridengine/Job16.1/freezer.state
FROZEN
$ qmod -usj 16
sgeadmin - unsuspended job 16
$ cat /cgroups/cpu_and_memory/gridengine/Job16.1/freezer.state
THAWED
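Under the hood, suspension with the freezer controller is just a write to the job's freezer.state file. A minimal sketch, assuming the cgroup mount point and job path shown in the transcript above:

# Freeze and thaw a Grid Engine job's cgroup by writing to its cgroup v1
# freezer.state file. The path is the one from the transcript above and is
# an assumption about the local cgroup mount point.
FREEZER_STATE = '/cgroups/cpu_and_memory/gridengine/Job16.1/freezer.state'

def set_freezer_state(state):
    # state is 'FROZEN' or 'THAWED'
    with open(FREEZER_STATE, 'w') as f:
        f.write(state)

def get_freezer_state():
    with open(FREEZER_STATE) as f:
        return f.read().strip()

set_freezer_state('FROZEN')    # suspend every process in the job
print get_freezer_state()      # FREEZING or FROZEN once the kernel finishes
set_freezer_state('THAWED')    # resume the job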

We will be announcing more new features in Grid Engine 2011.11 update 1 here on this blog. Stay tuned for our announcement.

Welcome, new reader! [The Adventures of Dr. McNinja]

THE ADVENTURES OF DR. MCNINJA has concluded, but I am happy you’ve found it! DO NOT JUST HIT THE BACK BUTTON ON THAT LAST STORY. Mega spoilers ahoy. Instead, go on right to the beginning and read from there, OR check out the handy NEW READER section, made specifically for your enjoyment and comfort. It’s on the top right of the page.

Happy reading!

-Christopher

Welcome, new reader! is a post from: The Adventures of Dr. McNinja


33p147 [The Adventures of Dr. McNinja]

33p147

33p147 is a post from: The Adventures of Dr. McNinja

Dr. Hastings’ Final Thoughts [The Adventures of Dr. McNinja]

The first drawing of Dr. McNinja.

Thank you so much for reading my comic! It’s been very important to me, and I’m so happy to have entertained so many people. I can’t begin to scratch the depth and richness of experiences I’ve had over the course of writing and drawing this. Huge thanks to my original inker Kent Archer and to colorist Anthony Clark. Thanks to the folks at TopatoCo and Dark Horse for putting books and merch out, and helping me do this as a real job for a while. Thanks to Ryan North for being the first person to link my comic. Thanks to Klaus Janson for being the first person to teach me how to tell a story in a comic, and thanks to Walter Simonson for telling me that my writing was better than my drawing. Thanks to all the friends who have done guest comics over the years. Thanks to Jacob Vigeveno and Chason Chaffin for helping with the website. Thanks to my parents, Dan and Mitzi Hastings for encouraging me to pursue comics, which is an insane thing for parents to do. Thanks to my spouse, Carly Monardo for being the person whose laughs are my favorite, and for painting the McNinja book covers.

And a MASSIVE thanks to you, the reader! I still cannot believe how many of you there are. Please feel free to come back and reread the archives whenever you like.

If you’d like to continue to read other things I’m writing, drawing, acting in, or directing, please subscribe to my brand new newsletter which will inform you of them. At the moment, that would be the ADVENTURE TIME comic from Boom!, THE UNBELIEVABLE GWENPOOL from Marvel, and a slew of other things I can’t say yet, including more stuff from Marvel, a new creator owned book, maybe a video game or two, and a new webcomic site. Get that newsletter to find out details when I can say them. If you’re worried about spam, well I’ll tell you I hate writing newsletters. So you won’t get it often. Once a month TOPS. And yes, there are still one or two more McNinja books coming out.

If you’d like to kick me a little tip for the thousands of pages of free comics, and help keep the server lights on in the future, you may do so via Paypal, and I’d be very grateful. (You can also check out the store, as always.)

Thanks again,
-Christopher

Dr. Hastings’ Final Thoughts is a post from: The Adventures of Dr. McNinja


33p146 [The Adventures of Dr. McNinja]

33p146

33p146 is a post from: The Adventures of Dr. McNinja

33p145 [The Adventures of Dr. McNinja]

33p145

33p145 is a post from: The Adventures of Dr. McNinja

33p144 [The Adventures of Dr. McNinja]

33p144

33p144 is a post from: The Adventures of Dr. McNinja

33p143 [The Adventures of Dr. McNinja]

33p143

33p143 is a post from: The Adventures of Dr. McNinja

33p142 [The Adventures of Dr. McNinja]

33p142

33p142 is a post from: The Adventures of Dr. McNinja

33p141 [The Adventures of Dr. McNinja]

33p141

33p141 is a post from: The Adventures of Dr. McNinja

33p140 [The Adventures of Dr. McNinja]

33p140

33p140 is a post from: The Adventures of Dr. McNinja

A Random Painting [Edmund Finney's Quest to Find the Meaning of Life]

I realized I don’t post my side-project artwork enough. Here is one of the things.


The Tonic [Edmund Finney's Quest to Find the Meaning of Life]

The Tonic

Hey all, an update is up there. I appreciate those who check back weekly for an update, but as explained before, that’s not going to be a realistic schedule for me any time in the foreseeable future. For now, I’ll just have to be one of those webcomics that just updates every now and then when the author can get around to it. As always, I’ll announce every update on Facebook, Twitter, and G+. Thanks!

Trap Door Ride [Edmund Finney's Quest to Find the Meaning of Life]

Trap Door Ride

Hey all, sorry again for the delay. Unfortunately my work and other tasks don’t allow for me to update on a consistent basis for the time being. But if you subscribe via feed, Facebook, Twitter, etc., you’ll get an update every time I post. GuarantEED.

Mustachio’ed Jones [Edmund Finney's Quest to Find the Meaning of Life]

Mustachio’ed Jones

How many of you are prepared for the inevitable with a fully functional safety closet? OH and don’t forget you can now get my books on Amazon, both paperback and Kindle!

EQComics now on Kindle and Amazon paperback! [Edmund Finney's Quest to Find the Meaning of Life]

Hey people, as previously mentioned, I’ve been working on getting my books into Kindle format for the past week or so, and it is now done!  Wherever you are, Amazon.com, Amazon.co.uk, Amazon.ca, etc., you can just go onto your Amazon and search “Edmund Finney” and it should come right up!

Not only are the Kindle versions there, but my paperback books are now there with the sweet Amazon discount  applied!

Thanks for reading! New comic will be posted around this weekend, probably Sunday. It’s bigger than usual.

 

Note I will be releasing PDF and CBZ online versions in the summertime. And maybe some entirely new books as well.

The Pattern [Edmund Finney's Quest to Find the Meaning of Life]

The Pattern

I hope everyone is having a good Valentines day weekend, you humans.

Where is Miss Crimson [Edmund Finney's Quest to Find the Meaning of Life]

Where is Miss Crimson

Hey y’all there’s a comic up there. Remember that since I don’t update consistently, the best place to keep updated is on my facebook, GPlus, twitter, and/or my RSS feed, links to which are hidden all over my website!

Mustache Collection [Edmund Finney's Quest to Find the Meaning of Life]

Mustache Collection

Hey y’all I hope everyone is having a good year thus far. For those of you in the areas recently hit by large amounts of snow, keep in mind that some of it is ghost snow. So be careful.

Deduction [Edmund Finney's Quest to Find the Meaning of Life]

Deduction

Sorry for the delay, and the not-as-refined art and lettering, as I’ve been sick for the past few days but wanted to make sure I posted a comic. It’s somewhat tough to draw and letter with a shaky hand. But hey the holidaytimes are upon us! I hope everyone is cheery!

Health Circumstances Demand a Longer, Deeper Timeout [Falkvinge on Liberty]

Personal: I ran headfirst into a bit of a classic burnout two years ago. I’m still recovering from it. I’ve been trying to maintain a presence and not make this condition show too much, but I need to scale down the rest of my presence too for a while in order to reset and recharge.

I’ve been starting and re-starting writing this post way too many times now. I’ve decided to just post it as a stream of consciousness, readable or not as it may be, rather than my usual bar of having some sort of clear red thread with step-by-step logical coherence.

Two years ago, while moving from Stockholm to Berlin, I hit the infamous brick wall. I became incapable of most work that required any form of vehicular travel — I was literally limited to walking distance. Yes, it felt as ridiculous as it sounds, but it was just a matter of accepting the lay of the land and working with it. At the time, I was able to maintain some illusion of normality while starting to wind down and recover behind the scenes, thanks to being able to work remotely. I’ve since stopped working altogether — or so I thought, at least — and focusing on recharging.

When you drive a solar-powered rover too aggressively in Kerbal Space Program and the sun goes down, the batteries deplete quickly. You can’t start driving the rover again when the sun goes up from its state of depleted batteries, not even at its rated speed; you have to wait until the batteries have recharged, even if the circumstances (i.e. shining sun) should otherwise make you able to operate nominally. This is a little bit the state I’m in: I should nominally be fine, with most of the everyday load reduced significantly, but my batteries are still not recharging at the rate I had expected them to. (Yes, I’m impatient, which is admittedly part of the problem in the first place.)

So to all people who have written to me over this past time that I haven’t responded to: Please accept my apologies. It’s not out of malice or disinterest I haven’t responded, I’m simply getting done in a month what I used to get done in a day, and even that is a marked improvement. The “need to respond” queue is silly long by now, and includes conference invites and whatnot, that I would normally have responded to within minutes. It includes pings from near friends, that I had hoped to spend a lot more time with here in Berlin, as well as distant friends.

A close friend of mine pointed me to a recent study about stress, one looking at burnout symptoms in places with very good work-life balance, which concluded that the body doesn't distinguish between obligations at work and obligations felt outside of work for any reason other than money. And she's right: I've been feeling a pressure to shoot video, to code open-source projects, to participate in the community. I need to, bluntly speaking, drop all of these expectations for the foreseeable future. “Go off-grid” is a little too harsh, but I'll need to turn off the expectation heartbeat on literally everything. I'll do random things from time to time when I have the energy and desire for it, which unfortunately won't be most of the time.

These recoveries basically take whatever damn time they please. I could have recharged batteries in six months, in a year, in ten years. I have honestly no idea and therefore I’m not setting any expectations, in either direction.

Time for a deeper and longer break.

I’d like to say “I’ll be back”, but I don’t think the person on the other side of this recovery is going to be the same person I am today. I am sure I will still want to change the world for the better, somehow. I just can’t tell today how I’ll be wanting to change the world tomorrow. So even though I’ll very likely be back doing something, it’ll very likely not be the exact same things I’ve done up until this point.

Bitcoin, the Bitcoin Cash roadmap, and the Law of Two Feet [Falkvinge on Liberty]

Bitcoin: As the dust settles after the November 15 bitcoin upgrade, the roadmaps have been updated with the new state of the protocol and people are starting to looking ahead to the next set of features. I thought I’d take the opportunity to give my view on it.

The new set of features ahead has been published on bitcoincash.org, which is for the most part spearheaded by the Bitcoin ABC implementation, but where Bitcoin Unlimited also deserves significant credit for research and development.

Clarification: “Bitcoin” refers to Bitcoin-BCH, or Bitcoin Cash
In this post, I’m talking about the “bitcoin roadmap”. As there’s more than one bitcoin, I should clarify that I’m referring to Bitcoin-BCH, or the “Cash” version of Bitcoin, as opposed to Bitcoin-BTC, the “Blockstream” fork of bitcoin. For those familiar with the subject, this would be obvious, as the Bitcoin-BTC version doesn’t have a roadmap to scale, such as I’m describing here.

This is the current “you are here” map as of end-2018:

The Bitcoin Cash roadmap as of end-2018, as published at bitcoincash.org.

I like this roadmap for two reasons. Or rather, for two levels of reasons.

The first is that I see bitcoin as the path to a world currency. In order to be so, it will need to carry an insanely heavier load, and because of the typical velocity of money, each bitcoin must be valued far higher than it is today — to a point where single satoshis are no longer a small unit, but represent maybe a few cents. That quanta (smallest possible discrete value) is not small enough to provide frictionless automated microtrade, which is why I’m looking forward to — and have been discreetly applauding — the fractional satoshis on the roadmap. The bigger footprint a network gets, the more inertia it takes to change something, so getting these two items in with reasonable speed is something I regard as key.

The third key item is extensibility — the ability to extend the protocol without asking permission, akin to how early browsers started supporting random new HTML markup tags left and right. This drove the standards forward and allowed for rapid feedback cycles with the user community, and something similar will be needed for permissionless innovation on top of bitcoin to really take off.

These three taken together happen to represent the final phase of the three tracks that the roadmap lists. I have some understanding that each of them have necessary prerequisites that are being filled in some sort of logical order.

This brings me to the Law of Two Feet.

You see, it doesn’t really matter what I think of a feature, whether I like it or not. I am a diehard proponent of the Law of Two Feet: It simply means that if you don’t like something, then it is your responsibility — both toward yourself and the community you don’t like — to walk to a place you do like.

(Just to be clear, the Law of Two Feet is inclusive. It also applies to people who don’t have two actual feet.)

This is what I worded as the Freedom of Initiative and the Freedom to Follow, and it is absolutely key for permissionless innovation. You don’t get that the moment somebody is trying to give somebody else permission on what road they may choose.

Each of us have the freedom to take any initiative we want.

Each of us also have the freedom to follow any initiative we like.

But no one of us may tell another what they must or may not do.

I happen to very much approve of the above roadmap from where I’m sitting. But even if I didn’t, the freedom of initiative and freedom to follow are far more important than my opinion on this particular initiative.

Pirate Party enters parliament in Luxembourg, gets 17% in Prague [Falkvinge on Liberty]

Photo by Jewel Mitchell on Unsplash

Pirate Parties: This past weekend, elections were held in Luxembourg and the Czech Republic. The Pirate Party of Luxembourg tripled their support and entered the Luxembourg Parliament with two MPs, and in the Czech Republic, the Pirate Party increased their support further – now receiving a full 17% in Prague.

With 6.45% of the votes of the final tally, the Luxembourg Pirate Party is entering its national Parliament, being the fifth Pirate Party to enter a national or supranational legislature (after Sweden, Germany, Iceland, and the Czech Republic). This may not seem like much, but it is a very big deal, for reasons I’ll elaborate on later. A big congratulations to Sven Clement and Marc Goergen, new Members of Parliament for Luxembourg!

Further, the Czech Republic has had municipal elections, and the Czech Pirate Party showed a full 17.1% support in Prague, the Czech capital, making the Pirates the second biggest party with a very narrow gap to the first place (at 17.9%). This may or may not translate to votes for the Czech national legislature, but is nevertheless the highest score recorded so far for a Pirate Party election day. I understand the Czech Pirates have as many as 275 (two hundred and seventy-five!) newly-elected members of city councils, up from 21 (twenty-one). Well done, well done indeed!

For people in a winner-takes-all system, like the UK or the United States, this may sound like a mediocre result. In those countries there are usually only two parties, and the loser with 49% of the vote gets nothing. However, most of Europe has so-called proportional systems, where 5% of the nationwide vote gives you 5% of the seats in the national legislature. In these systems, the parties elected to Parliament negotiate among themselves to find a ruling majority coalition of 51%+ of the seats, trying to negotiate common positions between parties that are reasonably close to each other in policy. This usually requires a few weeks of intense negotiations between the election and the presentation of a successfully negotiated majority coalition.

Further, it could reasonably be asked what kind of difference the Czech Republic or Luxembourg could possibly make on their own against the global information repression. The answer is: a whole lot. The key here is realizing that one country is sufficient to break the global repression of information; the repression is completely dependent on every single country keeping watertight doors. If one single country decides to allow the free movement of culture and knowledge, then all such distribution will immediately be based there. The copyright industry lobby in other countries will protest, quite loudly, but there's not really anything they can do about it.

And since the problem from a policymaking standpoint has been that industrial-age politicians consider the Internet-related policy areas completely peripheral in the first place, conceding those policy areas will be seen as a very cheap price to bind those votes to a majority coalition.

“One country is sufficient to break the global repression of information.”

A relevant comparison is how Canada has now legalized cannabis at the country level, following many state-level initiatives here and there in the world, and at once, the floodgates are open. Not just for the illegal distribution networks, but more importantly, for legalization everywhere else. As a German politician dryly said today, “what’s possible in Canada is also possible in Germany”, proposing that cannabis should be legalized outright in Germany. I would imagine the tone is similar in most places — or, importantly, many enough places.

The Luxembourg and Prague coalition talks have just started, with an outcome typically expected in a few weeks.

Analog Equivalent Rights (21/21): Conclusion, privacy has been all but eliminated from the digital environment [Falkvinge on Liberty]

Privacy: In a series of posts on this blog, we have shown how practically everything our parents took for granted with regards to privacy has been completely eliminated for our children, just because they use digital tools instead of analog, and the people interpreting the laws are saying that privacy only applies to the old, analog environment of our parents.

Once you agree with the observation that privacy seems to simply not apply for our children, merely for living in a digitally-powered environment instead of our parents’ analog-powered one, surprise turns to shock turns to anger, and it’s easy to want to assign blame to someone for essentially erasing five generations’ fight for civil liberties while people were looking the other way.

So whose fault is it, then?

It’s more than one actor at work here, but part of the blame must be assigned to the illusion that nothing has changed, just because our digital children can use old-fashioned and obsolete technology to obtain the rights they should always have by law and constitution, regardless of which method they use to talk to friends and exercise their privacy rights.

We’ve all heard these excuses.

“You still have privacy of correspondence, just use the old analog letter”. As if the Internet generation would. You might as well tell our analog parents that they would need to send a wired telegram to enjoy some basic rights.

“You can still use a library freely.” Well, only an analog one, not a digital one like The Pirate Bay, which differs from an analog library only in efficiency, and not in anything else.

“You can still discuss anything you like.” Yes, but only in the analog streets and squares, not in the digital streets and squares.

“You can still date someone without the government knowing your dating preferences.” Only if I prefer to date like our parents did, in the unsafe analog world, as opposed to the safe digital environment where predators vanish at the click of a “block” button, an option our analog parents didn’t have in shady bars.

The law isn’t different for the analog and the digital; it makes no distinction between the two. But no law is above the people who interpret it in the courts, and the way people interpret those laws means the privacy rights always apply to the analog world, but never to the digital world.

It’s not rocket science to demand that the same laws apply offline and online. This includes copyright law, as well as the fact that privacy of correspondence takes precedence over copyright law (in other words, you’re not allowed to open and examine private correspondence for infringements in the analog world, not without prior and individual warrants — our law books are full of these checks and balances; they should apply in the digital too, but don’t today).

Going back to blame, that’s one actor right there: the copyright industry. They have successfully argued that their monopoly laws should apply online just as they do offline, and in doing so, have completely ignored all the checks and balances that apply to the copyright monopoly laws in the analog world. And since copying movies and music has now moved into the same communications channels as we use for private correspondence, the copyright monopoly as such has become fundamentally incompatible with private correspondence at the conceptual level.

The copyright industry has been aware of this conflict and has been continuously pushing for eroded and eliminated privacy to prop up their crumbling and obsolete monopolies, such as pushing for the hated (and now court-axed) Data Retention Directive in Europe. They would use this federal law (or European equivalent thereof) to literally get more powers than the Police themselves in pursuing individual people who were simply sharing music and movies, sharing in the way everybody does.

There are two other major factors at work. The second factor is marketing. The reason we’re tracked at the sub-footstep level in airports and other busy commercial centers is simply to sell us more crap we don’t need. This comes at the expense of privacy that our analog parents took for granted. Don’t even get started on Facebook and Google.

Last but not least are the surveillance hawks — the politicians who want to look “Tough on Crime”, or “Tough on Terrorism”, or whatever the word of choice is this week. These were the ones who pushed the Data Retention Directive into law. The copyright industry were the ones who basically wrote it for them.

These three factors have been working together, and they’ve been very busy.

It’s going to be a long uphill battle to win back the liberties that were slowly won by our ancestors over about six generations, and which have been all but abolished in a decade.

It’s not rocket science that our children should have at least the same set of civil liberties in their digital environment, as our parents had in their analog environment. And yet, this is not happening.

Our children are right to demand Analog Equivalent Privacy Rights — the civil liberties our parents not just enjoyed, but took for granted.

I fear the failure to pass on the civil liberties from our parents to our children is going to be seen as the greatest failure of the current generation, regardless of all the good we also accomplish. Surveillance societies can be erected in just ten years, but can take centuries to roll back.

Privacy remains your own responsibility today. We all need to take it back merely by exercising our privacy rights, with whatever tools are at our disposal.

Image from the movie “Nineteen Eighty-Four”; used under fair use for political commentary.

Analog Equivalent Rights (20/21): Your analog boss couldn’t read your mail, ever [Falkvinge on Liberty]

Europe: Slack has updated its Terms of Service to let your manager read your private conversations in private channels. Our analog parents would have been shocked and horrified at the very idea that their bosses would open packages and read personal messages that were addressed to them. For our digital children, it’s another shrugworthy part of everyday life.

The analog plain old telephone system, sometimes abbreviated POTS, is a good template for how things should be even in the digital world. This is something that lawmakers got mostly right in the old analog world.

When somebody is on a phonecall — an old-fashioned, analog phonecall — we know that the conversation is private by default. It doesn’t matter who owns the phone. It is the person using the phone, right this very minute, who has all the rights to its communication capabilities.

The user has all the usage rights. The owner has no right to intercept or interfere with the communications usage, just based on the property right alone.

Put another way: just because you own a piece of communications equipment, that doesn’t give you any kind of automatic right to listen to private conversations that happen to come across this equipment.

Regrettably, this only applies to the telephone network. Moreover, only the analog part of the telephone network. If anything is even remotely digital, the owner can basically intercept anything they like, for any reason they like.

This particularly extends to the workplace. It can be argued that you have no expectation of privacy for what you do on your employer’s equipment; but that argument forgets that such privacy was paramount for the POTS less than two decades ago, regardless of who owned the equipment.

Some employers even install their own trusted root certificates on workplace computers, with the specific purpose of negating any end-to-end security between the employee’s computer and the outside world: a corporate proxy re-signs every secure connection on the fly, effectively performing a so-called “man-in-the-middle attack”. When it’s performed by your employer instead of another adversary, the practice goes by the whitewashed term HTTPS Interception rather than “man-in-the-middle attack”.
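For the technically curious, here is a minimal Python sketch of how such interception can be spotted from the employee’s side; the hostname is just a placeholder, and this is an illustration of the principle, not a complete detection tool. It opens a TLS connection and prints which organization signed the certificate it received; on an intercepted network, that issuer is typically the employer’s own certificate authority rather than a well-known public one.

    # Minimal sketch: inspect who signed the certificate a TLS connection presents.
    # On a network performing HTTPS Interception, the issuer will typically be the
    # organization's own CA rather than a well-known public certificate authority.
    import socket
    import ssl

    HOST = "example.com"   # placeholder hostname, used only for illustration
    PORT = 443

    context = ssl.create_default_context()
    with socket.create_connection((HOST, PORT)) as sock:
        with context.wrap_socket(sock, server_hostname=HOST) as tls:
            cert = tls.getpeercert()
            # "issuer" is a sequence of relative distinguished names (RDNs).
            issuer = {key: value for rdn in cert["issuer"] for key, value in rdn}
            print("Certificate issued by:", issuer.get("organizationName"))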

Since we’re looking at the difference between analog and digital, and how privacy rights have vanished in the transition to digital, it’s worth looking at the code of law for the oldest of analog correspondences: the analog letter, and whether your boss could open and read it just because it was addressed to you at your workplace.

Analog law differs somewhat between different countries on this issue, but in general, even if your manager or workplace were allowed to open your mail (which is the case in the United States but not in Britain), they are typically never allowed to read it (even in the United States).

In contrast, with electronic mail, your managers don’t just read your entire e-mail; they have typically hired an entire department to read it for them. In Europe, this went as far as the European Court of Human Rights, which ruled that it’s totally fine for an employer to read the most private of correspondence, as long as the employer informs the employee of this fact (thereby negating the default expectation of privacy).

Of course, this principle about somewhat-old-fashioned e-mail applies to any and all electronic communications now, such as Slack.

So for our digital children, the concept of “mail is private and yours, no matter if you receive it at the workplace” appears to have been irrevocably lost. This was a concept our analog parents took so for granted, they didn’t see any need to fight for it.

Today, privacy remains your own responsibility.

Analog Equivalent Rights (19/21): Telescreens in our Living Rooms [Falkvinge on Liberty]

Privacy: The dystopic stories of the 1950s said the government would install cameras in our homes, with the government listening in and watching us at all times. Those stories were all wrong, for we installed the cameras ourselves.

In the analog world of our parents, it was taken completely for granted that the government would not be watching us in our own homes. It’s so important an idea, it’s written into the very constitutions of states pretty much all around the world.

And yet, for our digital children, this rule, this bedrock, this principle is simply… ignored. Just because their technology is digital, and not the analog technology of our parents.

There are many examples of how this has taken place, despite being utterly verboten. Perhaps the most high-profile one is the OPTIC NERVE program of the British surveillance agency GCHQ, which wiretapped video chats without the people concerned knowing about it.

Yes, this means the government was indeed looking into people’s living rooms remotely. Yes, this means they sometimes saw people in the nude. Quite a lot of “sometimes”, even.

According to summaries in The Guardian, over ten percent of the viewed conversations may have been sexually explicit, and 7.1% contained undesirable nudity.

Taste that term. Speak it out loud, to hear for yourself just how oppressive it really is. “Undesirable nudity”. The way you are described by the government, in a file about you, when looking into your private home without your permission.

When the government writes you down as having “undesirable nudity” in your own home.

There are many other examples, such as the state schools that activate school-issued webcams, or even the US government outright admitting it’ll use all your home devices against you.

It’s hard not to think of the 1984 quote here:

The telescreen received and transmitted simultaneously. Any sound that Winston made, above the level of a very low whisper, would be picked up by it, moreover, so long as he remained within the field of vision which the metal plaque commanded, he could be seen as well as heard. There was of course no way of knowing whether you were being watched at any given moment. How often, or on what system, the Thought Police plugged in on any individual wire was guesswork. It was even conceivable that they watched everybody all the time. But at any rate they could plug in your wire whenever they wanted to. You had to live — did live, from habit that became instinct — in the assumption that every sound you made was overheard, and, except in darkness, every movement scrutinized. — From Nineteen Eighty-Four

And of course, this has already happened. The so-called “Smart TVs” from LG, Vizio, Samsung, Sony, and surely others have been found to do just this — spy on their owners. It can be argued that the collected data was only collected by the TV manufacturer. It can equally be argued, by the police officers knocking on that manufacturer’s door, that the manufacturer has no right to keep such data to itself, and that the government wants in on the action, too.

There’s absolutely no reason our digital children shouldn’t enjoy the Analog Equivalent Rights of having their own home to their very selves, a right our analog parents took for granted.

Analog Equivalent Rights (18/21): Our analog parents had private conversations, both in public and at home [Falkvinge on Liberty]

Privacy: Our parents, at least in the Western world, had a right to hold private conversations face-to-face, whether out in public or in the sanctity of their home. This is all but gone for our digital children.

Not long ago, it was the thing of horror books and movies that there would actually be widespread surveillance of what you said inside your own home. Our analog parents literally had this as scary stories worthy of Halloween, mixing the horror with the utter disbelief.

“There was of course no way of knowing whether you were being surveilled at any given moment. How often, or on what system, the Thought Police plugged in on any individual device was guesswork. It was even conceivable that they listened to everybody all the time. But at any rate they could listen to you whenever they wanted to. You had to live — did live, from habit that became instinct — in the assumption that every sound you made was overheard.” — from Nineteen Eighty-Four

In the West, we prided ourselves on not being the East — the Communist East, specifically — who regarded their own citizens as suspects: suspects who needed to be cleansed of bad thoughts and bad conversations, to the degree that ordinary homes were wiretapped for ordinary conversations.

There were microphones under every café table and in every residence. And even if there weren’t in the literal sense, just there and then, they could still be anywhere, so you had to live — did live, from habit that became instinct — in the assumption that every sound you made was overheard.

“Please speak loudly and clearly into the flower pot.” — a common not-joke about the Communist societies during the Cold War

Disregard phonecalls and other remote conversations for now, since we already know them to be wiretapped across most common platforms. Let’s look at conversations in a private home.

We now have the Amazon Echo, with its Alexa assistant, and the Google Home. And while their makers might have intended to keep your conversations to themselves, out of the reach of authorities, Amazon has already handed over living room recordings to authorities. In this case, permission became a moot point because the suspect gave permission. In the next case, permission might not be there, and it might happen anyway.

Mobile phones are already listening, all the time. We know because when we say “Ok Google” to an Android phone, it wakes up and listens more intensely. This, at a very minimum, means it’s always listening for the words “Ok Google”. iPhones have a similar mechanism listening for “Hey Siri”. While nominally possible to turn off, it’s one of those things you can never be sure of. And we carry these governmental surveillance microphones with us everywhere we go.

If the Snowden documents showed us anything in the general sense, it was that if a certain form of surveillance is technically possible, it is already happening.

And even if Google and Apple aren’t already listening, the German police got the green light to break into phones and plant Bundestrojaner, the flower-pot equivalent of hidden microphones, anyway. You would think that Germany, of all countries, remembers from recent history what a bad idea this is. It could — maybe even should — be assumed that the police forces of other countries have similar tools and are already using them.

For our analog parents, the concept of a private conversation was as self-evident as oxygen in the air. Our digital children may never know what one feels like.

And so we live today — from what started as a habit that has already become instinct — in the assumption that every sound we make is overheard by authorities.

Analog Equivalent Rights (17/21): The Previous Inviolability of Diaries [Falkvinge on Liberty]

Privacy: For our analog parents, a diary or a personal letter could rarely be touched by authorities, not even by law enforcement searching for evidence of a crime. Objects such as these had protection over and above the constitutional privacy safeguards. For our digital children, however, the equivalent diaries and letters aren’t even considered worthy of basic constitutional privacy.

In most jurisdictions, there is a constitutional right to privacy. Law enforcement in such countries can’t just walk in and read somebody’s mail, wiretap their phonecalls, or track their IP addresses. They need a prior court order to do so, which in turn is based on a concrete suspicion of a serious crime: the general case is that you have a right to privacy, and violations of this rule are the exception, not the norm.

However, there’s usually a layer of protection over and above this: even if and when law enforcement gets permission from a judge to violate somebody’s privacy in the form of a search warrant of their home, there are certain things that may not be touched unless specific and additional permissions are granted by the same type of judge. This class of items includes the most private of the personal: private letters, diaries, and so on.

Of course, this is only true in the analog world of our parents. Even though the letter of the law is the same, this protection doesn’t apply at all to the digital world of our children, to their diaries and letters.

Because the modern diary is kept on a computer. If not on a desktop computer, then certainly on a mobile handheld one — what we’d call a “phone” for historical reasons, but what’s really a handheld computer.

And a computer is a work tool in the analog world of our parents. There are loads of precedent cases that establish any form of electronic device as a work tool, dating back well into the analog world, and law enforcement is falling back on all of them with vigor, even now that our digital devices are holding our diaries, personal letters, and other items far more private than an analog diary was ever capable of.

That’s right: whereas your parents’ diaries were extremely protected under the law of the land, your children’s diaries — no less private to them, than those of your parents were to your parents — are as protected from search and seizure as an ordinary steel wrench in a random workshop.

So how did we get from point A to point B here? Why are the Police, who know that they can’t touch an analog diary during a house search, instantly grabbing the mobile phones which serve the same purpose for our children?

“Because they can”, is the short answer. “Also because nobody put their foot down” for advanced points on the civics course. It’s because some people saw short term political points in being “tough on crime” and completely erasing hard-won rights in the process.

Encrypt everything.
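What does that look like in practice? Here is a minimal sketch, assuming the third-party Python “cryptography” package is installed (pip install cryptography); any comparable tool serves the same purpose of making a diary readable only to whoever holds the key.

    # A minimal sketch of keeping a diary entry unreadable without its key.
    # Assumes the third-party "cryptography" package: pip install cryptography
    from cryptography.fernet import Fernet

    key = Fernet.generate_key()     # keep this key somewhere a device search won't reach
    diary = Fernet(key)

    entry = "Dear diary: thoughts I'd rather not share with a search warrant.".encode()
    token = diary.encrypt(entry)    # this ciphertext is what gets stored on the phone

    print(diary.decrypt(token).decode())   # only the key holder can read it back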

Analog Equivalent Rights (16/21): Retroactive surveillance of all our children [Falkvinge on Liberty]

Privacy: In the analog world of our parents, it was absolutely unthinkable that the government would demand to know every footstep you took, every phonecall you made, and every message you wrote, just as a routine matter. For our digital children, government officials keep insisting on this as though it were perfectly reasonable, because terrorism, and also, our digital children may be listening to music together or watching TV together, which is illegal in the way they like to do it, because of mail-order legislation from Hollywood. To make things even worse, the surveillance is retroactive — it is logged, recorded, and kept until somebody wants all of it.

About ten years ago, a colleague of mine moved from Europe to China. He noted that among many differences, the postal service was much more tightly controlled — as in, every letter sent was written by hand onto a line in a log book, kept by the postmaster at each post office. Who the letter was from, who it was to, and the date.

At the time, three things struck me: one, how natural this was to the Chinese population, who didn’t really know anything else; two, how horrified our analog parents would have been at this concept, and how loudly they would have denounced it; three, and despite that, that this is exactly what our lawmaker analog parents are doing to all our digital children right now.

Or trying to do, anyway; the courts are fighting back hard.

Yes, I’m talking about Telecommunications Data Retention.

There is a saying, which mirrors the Chinese feeling of normality about this quite well: “The bullshit this generation puts up with as a temporary nuisance from deranged politicians will seem perfectly ordinary to the next generation.”

Every piece of surveillance so far in this series is amplified by several orders of magnitude by the notion that you’re not only being watched, but that everything you do is recorded for later use against you.

This is a concept so bad, not even Nineteen Eighty-Four got it: if Winston’s telescreen missed him doing something that the regime didn’t want him to do, Winston would have been safe, because there was no recording happening; only surveillance in the moment.

If Winston Smith had had today’s surveillance regime, with recording and data retention, the regime could and would have gone back and re-examined every earlier piece of action for what they might have missed.

This horror is reality now, and it applies to every piece in this series. Our digital children aren’t just without privacy in the moment, they’re retroactively without privacy in the past, too.

(Well, this horror is a reality that comes and goes, as legislators and courts are in a tug of war. In the European Union, Data Retention was mandated in 2005 by the European Parliament, was un-mandated in 2014 by the European Court of Justice, and prohibited in 2016 by the same Court. Other jurisdictions are playing out similar games; a UK court just dealt a blow to the Data Retention there, for example.)

Privacy remains your own responsibility.

Analog Equivalent Rights (15/21): Our digital children’s conversations are muted on a per-topic basis [Falkvinge on Liberty]

Privacy: At worst, our analog parents could be prevented from meeting each other. Our digital children are prevented from talking about particular subjects, once the conversation is already happening. This is a horrifying development.

When our digital children are posting a link to The Pirate Bay somewhere on Facebook, a small window sometimes pops up saying “you have posted a link with potentially harmful content. Please refrain from posting such links.”

Yes, even in private conversations. Especially in private conversations.

This may seem like a small thing, but it is downright egregious. Our digital children are not prevented from having a conversation, per se, but are monitored for bad topics that the regime doesn’t like being discussed, and are prevented from discussing those topics. This is far worse than preventing certain people from just meeting.

The analog equivalent would be if our parents were holding an analog phone conversation, and a menacing third voice broke into the call, speaking slowly and just softly enough to be perceived as threatening: “You have mentioned a prohibited subject. Please refrain from discussing prohibited subjects in the future.”

Our parents would have been horrified if this happened — and rightly so!

But in the digital world of our children, the same phenomenon is instead cheered on by the same people who would abhor it if it happened in their world, to themselves.

In this case, of course, it is any and all links to The Pirate Bay that are considered forbidden topics, under the assumption — assumption! — that they lead to manufacturing of copies that would be found in breach of the copyright monopoly in a court of law.

When I first saw the Facebook window above telling me to not discuss forbidden subjects, I was trying to distribute political material I had created myself, and used The Pirate Bay to distribute. It happens to be a very efficient way to distribute large files, which is exactly why it is being used by a lot of people for that purpose (gee, who would have thought?), including people like myself who wanted to distribute large collections of political material.

There are private communications channels, but far too few use them, and the politicians at large (yes, this includes our analog parents) are still cheering on this development, because “terrorism” and other bogeymen.

Privacy remains your own responsibility.

Analog Equivalent Rights (14/21): Our analog parents’ dating preferences weren’t tracked, recorded, and cataloged [Falkvinge on Liberty]

Privacy: Our analog parents’ dating preferences were considered among the most private of matters. For our digital children, their dating preferences are a wholesale harvesting opportunity for marketing purposes. How did this terrifying shift come to be?

I believe the first big harvester of dating preferences was the innocent-looking site hotornot.com 18 years ago, a site that seemed more like the after-hours side project of a frustrated highschooler than a clever marketing ploy. It simply allowed people to rate the perceived attractiveness of uploaded photographs, and to upload photographs for such rating. (The two founders of this alleged highschool side project netted $10 million each for it when the site was sold.)

Then the scene exploded, with both user-funded and advertising-funded dating sites, all of which cataloged people’s dating preferences to the smallest detail.

Large-scale pornography sites, like Pornhub, also started cataloging people’s porn preferences, and continuously make interesting infographics about geographical differences in preferences. (The link is safe for work; it’s data and maps in the form of a news story on Inverse, not on Pornhub directly.) It’s particularly interesting, as Pornhub is able to break down preferences quite specifically by age, location, gender, income brackets, and so on.

Do you know anyone who told Pornhub any of that data? No, I don’t either. And still, they are able to pinpoint who likes what with quite some precision, precision that comes from somewhere.

And then, of course, we have the social networks (which may or may not be responsible for that tracking, by the way).

It’s been reported that Facebook can tell if you’re gay or not with as little as three likes. Three. And they don’t have to be related to dating preferences or lifestyle preferences — they can be any random selections that just map up well with bigger patterns.

This is bad enough in itself, on the basis that it’s private data. At a very minimum, our digital children’s preferences should be their own, just like their favorite ice cream.

But a dating preference is not just a preference like choosing your flavor of ice cream, is it? It should be, but it isn’t at this moment in time. It could also be something you’re born with. Something that people even get killed for if they’re born with the wrong preference.

It is still illegal to be born homosexual in 73 out of 192 countries, and out of these 73, eleven prescribe the death penalty for being born this way. A mere 23 out of 192 countries have full marriage equality.

Further, although the policy direction is quite one-way toward more tolerance, acceptance, and inclusion at this point in time, that doesn’t mean the policy trend can’t reverse for a number of reasons, most of them very bad. People who felt comfortable in expressing themselves can again become persecuted.

Genocide is almost always based on public data collected with benevolent intent.

This is why privacy is the last line of defense, not the first. And this last line of defense, which held fast for our analog parents, has been breached for our digital children. That matter isn’t taken nearly seriously enough.

Privacy remains your own responsibility.

Analog Equivalent Rights (13/21): Our digital children are tracked not just in everything they buy, but in what they DON’T buy [Falkvinge on Liberty]

Privacy: We’ve seen how our digital children’s privacy is violated in everything they buy with cash or credit, in a way our analog parents would have balked at. But even worse: our digital children’s privacy is also violated by tracking what they don’t buy — either actively decline or just plain walk away from.

Amazon just opened its first “Amazon Go” store, where you just put things into a bag and leave, without ever going through a checkout process. As part of the introduction of this concept, Amazon points out that you can pick something off the shelves, at which point it registers in your purchase — and change your mind and put it back, at which point you’ll be registered and logged as having not purchased the item.

Sure, you’re not paying for something you changed your mind about, which is the point of the video presentation. But it’s not just about the deduction from your total amount to pay: Amazon also knows you considered buying it and eventually didn’t, and will be using that data.

Our digital children are tracked this way on a daily basis, if not an hourly basis. Our analog parents never were.

When we’re shopping for anything online, there are even simple plugins for the most common merchant solutions that go by the business terms “funnel analysis” — tracking where in the so-called “purchase funnel” our digital children choose to leave the process of purchasing something — or “cart abandonment analysis”.
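As a toy illustration of what such a plugin computes, consider the sketch below; the event names and visitors are invented, and real implementations differ, but the principle of logging how far each visitor got before walking away is the same.

    # Toy sketch of "funnel analysis": count how many visitors reached each step
    # of a purchase, so the merchant sees exactly where people walked away.
    # Event names and visitors are invented for illustration.
    from collections import Counter

    FUNNEL = ["view_product", "add_to_cart", "start_checkout", "purchase"]

    sessions = {
        "visitor_1": {"view_product", "add_to_cart", "start_checkout", "purchase"},
        "visitor_2": {"view_product", "add_to_cart"},   # classic "cart abandonment"
        "visitor_3": {"view_product"},
    }

    reached = Counter()
    for events in sessions.values():
        for step in FUNNEL:
            if step in events:
                reached[step] += 1

    for step in FUNNEL:
        print(f"{step:15} {reached[step]} of {len(sessions)} visitors")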

We can’t even simply walk away from something anymore without it being recorded, logged, and cataloged for later use against us.

But so-called “cart abandonment” is only one part of the bigger issue in the age of our digital children: tracking what we were interested in, but didn’t buy. There is no shortage of people today who would swear they were just discussing a very specific type of product with their phone present (say, “black leather skirts”) and all of a sudden, advertising for that very specific type of product would pop up all over Facebook and/or Amazon ads. Is this really due to some company listening for keywords through the phone? Maybe, maybe not. All we know since Snowden is that if it’s technically possible to invade privacy, it is already happening.

(We have to assume here these people still need to learn how to install a simple adblocker. But still.)

At the worst ad-dense places, like (but not limited to) airports, there are eyeball trackers to find out which ads you look at. They don’t yet change to match your interests, as per Minority Report, but that technology is already present on your phone and on your desktop, so it wouldn’t be surprising to see it in public soon, either.

In the world of our analog parents, we weren’t registered and tracked when we bought something.

In the world of our digital children, we’re registered and tracked even when we don’t buy something.

Analog Equivalent Rights (12/21): Our parents bought things untracked, their footsteps in store weren’t recorded [Falkvinge on Liberty]

Privacy: In the last article, we focused on how people are tracked today when using credit cards instead of cash. But few pay attention to the fact that we’re tracked when using cash today, too.

Few people pay attention to the little sign on the revolving door on Schiphol Airport in Amsterdam, Netherlands. It says that wi-fi and bluetooth tracking of every single individual is taking place in the airport.

What sets Schiphol Airport apart isn’t that they track individual people’s movements to the sub-footstep level in a commercial area. (It’s for commercial purposes, not security purposes.) No, what sets Schiphol apart is that they bother to tell people about it. (The Netherlands tends to take privacy seriously, as does Germany, and for the same reason.)

Locator beacons are practically a standard in bigger commercial areas now. They ping your phone using wi-fi and bluetooth, and using signal strength triangulation, a grid of locator beacons is able to show how every single individual is moving in realtime at the sub-footstep level. This is used to “optimize marketing” — in other words, to find ways to trick people’s brains into spending resources they otherwise wouldn’t have spent. Our own loss of privacy is being turned against us, as it always is.
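The underlying math is unremarkable. A rough sketch follows, with made-up beacon positions and a textbook path-loss model (both are assumptions, not a description of any particular vendor’s system), showing how three signal-strength readings are enough to place a phone on a floor plan.

    # Rough sketch: turn three signal-strength readings (RSSI, in dBm) into a
    # position estimate. Beacon coordinates and radio constants are invented.

    def rssi_to_distance(rssi_dbm, tx_power_dbm=-59.0, path_loss_exp=2.0):
        """Log-distance path-loss model: a stronger signal implies a shorter distance."""
        return 10 ** ((tx_power_dbm - rssi_dbm) / (10 * path_loss_exp))

    def trilaterate(p1, p2, p3, d1, d2, d3):
        """Intersect three distance circles around beacons at known positions."""
        (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
        a, b = 2 * (x2 - x1), 2 * (y2 - y1)
        c = d1**2 - d2**2 - x1**2 + x2**2 - y1**2 + y2**2
        d, e = 2 * (x3 - x2), 2 * (y3 - y2)
        f = d2**2 - d3**2 - x2**2 + x3**2 - y2**2 + y3**2
        x = (c * e - b * f) / (a * e - b * d)
        y = (a * f - c * d) / (a * e - b * d)
        return x, y

    # Three beacons (positions in meters) hear the same phone at different strengths.
    beacons = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    readings = [-63.0, -67.0, -65.0]
    distances = [rssi_to_distance(r) for r in readings]
    print("Estimated position:", trilaterate(*beacons, *distances))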

Where do people stop for a while, what catches their attention, what doesn’t catch their attention, what’s a roadblock for more sales?

These are legitimate questions. However, taking away people’s privacy in order to answer those questions is not a legitimate method to answer them.

This kind of mass individual tracking has even been deployed at city levels, which happened in complete silence until the Privacy Oversight Board of a remote government sounded the alarm. The city of Västerås got the green light to continue tracking once some formal criteria were met.

Yes, this kind of people tracking is documented to have been already rolled out citywide in at least one small city in a remote part of the world (Västerås, Sweden). With the government’s Privacy Oversight Board having shrugged and said “fine, whatever”, don’t expect this to stay in the small town of Västerås. Correction, wrong tense: don’t expect it to have stayed in just Västerås, where it was greenlit three years ago.

Our analog parents had the ability to walk around untracked in the city and street of their choice, without it being used or held against them. It’s not unreasonable that our digital children should have the same ability.

There’s one other way to buy things with cash which avoids this kind of tracking, and that’s paying cash-on-delivery when ordering something online or over the phone to your door — in which case your purchase is also logged and recorded, just in another type of system.

This isn’t only used against the ordinary citizen for marketing purposes, of course. It’s used against the ordinary citizen for every conceivable purpose. But we’ll be returning to that in a later article in the series.

Privacy remains your own responsibility.

Analog Equivalent Rights (11/21): Our parents used anonymous cash [Falkvinge on Liberty]

Privacy: The anonymous cash of our analog parents is fast disappearing, and in its wake come trackable and permissioned debit cards for our children. While convenient, they are a wolf in sheep’s clothing.

In the last article, we looked at how our analog parents could anonymously buy a newspaper on the street corner with some coins, and read their news of choice without anybody knowing about it. This observation extends to far more than just newspapers, of course.

This ability of our parents – the ability to conduct decentralized, secure transactions anonymously – has been all but lost in a landscape that keeps pushing card payments for convenience. The convenience of not paying upfront, with credit cards; the convenience of always paying an exact amount, with debit cards; the convenience of not needing to carry and find exact amounts with every purchase. Some could even argue that having every transaction listed on a bank statement is a convenience of accounting.

But with accounting comes tracking. With tracking comes predictability and unwanted accountability.

It’s been said that a VISA executive can predict a divorce one year ahead of the parties involved, based on changes in purchase patterns. Infamously, a Target store was targeting a high school-aged woman with maternity advertising, which at first made her father furious: but as things turned out, the young woman was indeed pregnant. Target knew, and her own father didn’t.

This is because when we’re no longer using anonymous cash, every single purchase is tracked and recorded with the express intent of using it against us — whether for influencing us to make a choice to deplete our resources (“buy more”) or for punishing us for buying something we shouldn’t have, in a wide variety of conceivable ways.

China is taking the concept one step further, as has been written here before, and in what must have been the inspiration for a Black Mirror episode, is weighting its citizens’ Obedience Scores based on whether they buy useful or lavish items — useful in the views of the regime, of course.

It’s not just the fact that transactions of our digital children are logged for later use against them, in ways our analog parents could never conceive of.

It’s also that the transactions of our digital children are permissioned. When our digital children buy a bottle of water with a debit card, a transaction clears somewhere in the background. But that also means that somebody can decide to have the transaction not clear; somebody has the right to arbitrarily decide what people get to buy and not buy, if this trend continues for our digital children. That is a horrifying thought.

Our parents were using decentralized, censorship-resistant, anonymous transactions when they used plain cash. There is no reason our digital children should have anything less. It’s a matter of liberty and self-determination.

Privacy remains your own responsibility.

Analog Equivalent Rights (10/21): Analog journalism was protected; digital journalism isn’t [Falkvinge on Liberty]

Privacy: In the analog world of our parents, leaks to the press were heavily protected at both ends – both for the leaker and for the reporter receiving the leak. In the digital world of our children, this protection has been unceremoniously thrown out the window while lawmakers were discussing something else entirely. Why aren’t our digital children afforded the same checks and balances?

Another area where privacy rights have not been carried over from the analog to the digital concerns journalism, an umbrella of different activities we consider to be an important set of checks-and-balances on power in society. When somebody handed over physical documents to a reporter, that was an analog action that was protected by federal and state laws, and sometimes even by constitutions. When somebody is handing over digital access to the same information to the same type of reporter, reflecting the way we work today and the way our children will work in the future, that is instead prosecutable at both ends.

Let us illustrate this with an example from the real world.

In the 2006 election in Sweden, there was an outcry of disastrous information hygiene on behalf of the ruling party at the time (yes, the same ruling party that later administered the worst governmental leak ever). A username and password circulated that gave full access to the innermost file servers of the Social Democratic party administration from anywhere. The username belonged to a Stig-Olof Friberg, who was using his nickname “sigge” as username, and the same “sigge” as password, and who accessed the innermost files over the Social Democratic office’s unencrypted, open, wireless network.

Calling this “bad opsec” doesn’t begin to describe it. Make a careful note to remember that these were, and still are, the institutions and people we rely on to make policy for good safeguarding of sensitive citizen data.

However, in the shadow of this, there was also the more important detail that some political reporters were well aware of the login credentials, such as one of Sweden’s most (in)famous political reporters Niklas Svensson, who had been using the credentials as a journalistic tool to gain insight into the ruling party’s workings.

This is where it gets interesting, because in the analog world, that reporter would have received leaks in the form of copied documents, physically handed over to him, and leaking to the press in this analog manner was (and still is) an extremely protected activity under law and indeed some constitutions — in Sweden, which this case concerns, you can even go to prison for casually speculating over coffee at work about who might have been behind a leak to the press. It is taken extremely seriously.

However, in this case, the reporter wasn’t leaked the documents, but was leaked a key for access to the digital documents — the ridiculously insecure credentials “sigge/sigge” — and was convicted in criminal court for electronic trespassing as a result, despite doing journalistic work with a clear analog protected equivalent.

It’s interesting to look at history to see how much critically important events would never have been uncovered, if this prosecution of digital journalism had been applied to analog journalism.

For one example, let’s take the COINTELPRO leak, when activists copied files from an FBI office to uncover a covert and highly illegal operation by law enforcement to discredit political organizations based solely on their political opinion. (This is not what law enforcement should be doing, speaking in general terms.) This leak happened when activists put up a note on the FBI office door on March 8, 1971 saying “Please do not lock this door tonight”, came back in the middle of the night when nobody was there, found the door unlocked as requested, and took (stole) about 1,000 classified files that revealed the illegal practices.

These were then mailed to various press outlets. The theft resulted in the exposure of some of the FBI’s most self-incriminating documents, including several documents detailing the FBI’s use of postal workers, switchboard operators, etc., in order to spy on black college students and various non-violent black activist groups, according to Wikipedia. And here’s the kicker in the context: while the people stealing the documents could and would have been indicted for doing so, it was unthinkable to charge the reporters receiving them with anything.

This is no longer the case.

Our digital children have lost the right to leak information to reporters in the way the world works today, an activity that was taken for granted — indeed, seen as crucially important to the balance of power — in the world of our analog parents. Our digital children who work as reporters can no longer safely receive leaks showing abuse of power. It is entirely reasonable that our digital children should have at least the same set of civil liberties in their digital world, as our parents had in their analog world.

Privacy remains your own responsibility.

Analog Equivalent Rights (9/21): When the government knows what news you read, in what order, and for how long [Falkvinge on Liberty]

Privacy: Our analog parents had the ability to read news anonymously, however they wanted, wherever they wanted, and whenever they wanted. For our digital children, a government agent might as well be looking over their shoulder: the government knows what news sources they read, what articles, for how long, and in what order.

For our analog parents, reading the news was an affair the government had no part of, or indeed had any business being part of. Our analog parents bought a morning newspaper with a few coins on the street corner, brought it somewhere quiet where they had a few minutes to spare, and started reading without anybody interfering.

When our digital children read the news, the government doesn’t just know what news source they choose to read, but also what specific articles they read from that news source, in what order, and for how long. So do several commercial actors. There are at least three grave issues with this.

The first is that since the government has this data, it will attempt to use this data. More specifically, it will attempt to use the data against the individual concerned, possibly in some sort of pre-crime scheme. We know this because all data collected by a government will eventually be used against the people concerned; it is a mathematical certainty.

In an attention economy, data about what we pay attention to, how much, and for how long, is an absolutely crucial predictor of behavior. And in the hands of a government which makes the crucial mistake of using it to predict pre-crime, the results can be disastrous for the individual and plain wrong for the government.

Of course, the instant the government uses this data in any way imaginable, positive or negative, it will become Heisenberg Metrics — the act of using the data will shape the data itself. For example, if somebody in government decides that reading about frugality probably is an indicator of poverty, and so makes people more eligible for government handouts, then such a policy will immediately shape people’s behavior to read more about frugality. Heisenberg Metrics is when a metric can’t be measured without making it invalid in the process.

(The phenomenon is named after the Heisenberg Uncertainty Principle, which is traditionally confused with the Observer Effect, which states you can’t measure some things without changing them in the process. The Heisenberg Uncertainty Principle is actually something else entirely; it states that you can’t measure precise momentum and position of a subatomic particle at the same time, and does not apply at all to Heisenberg Metrics.)

The second issue is that not only government, but also other commercial actors, will seek to act on these metrics, Heisenberg Metrics as they may be. Maybe somebody thinks that reading fanzines about motorcycle acrobatics should have an effect on your health and traffic insurance premiums?

The third issue is subtle and devious, but far more grave: the government doesn’t just know what articles you read and in what order, but as a corollary to that, knows what the last article you read was, and what you did right after reading it. In other words, it knows very precisely what piece of information leads you to stop reading and instead take a specific action. This is far more dangerous information than being aware of your general information feed patterns and preferences.

Being able to predict somebody’s actions with a high degree of certainty is a far more dangerous ability than being vaguely aware of somebody’s entertainment preferences.

Our analog parents had the privacy right of choosing their information source anonymously with nobody permitted (or able) to say what articles they read, in what order, or for what reason. It’s not unreasonable that our digital children should have the same privacy right, the analog equivalent privacy right.

Privacy remains your own responsibility.

Analog Equivalent Rights (8/21): Using Third-Party Services Should Not Void Expectation of Privacy [Falkvinge on Liberty]

Privacy: Ross Ulbricht handed in his appeal to the U.S. Supreme Court last week, highlighting an important Analog Equivalent Privacy Right in the process: Just because you’re using equipment that makes a third party aware of your circumstances, does that really nullify any expectation of privacy?

In most constitutions, there’s a protection of privacy of some kind. In the European Charter of Human Rights, this is specified as having the right to private and family life, home, and correspondence. In the U.S. Constitution, it’s framed slightly differently, but with the same outcome: it’s a ban for the government to invade privacy without good cause (“unreasonable search and seizure”).

U.S. courts have long held that if you have voluntarily given up some part of your digitally-stored privacy to a third party, then you can no longer expect to have privacy in that area. When looking at analog equivalence for privacy rights, this doctrine is atrocious, and in order to understand just how atrocious, we need to go back to the dawn of the manual telephone switchboards.

At the beginning of the telephone age, switchboards were fully manual. When you requested a telephone call, a manual switchboard operator would manually connect the wire from your telephone to the wire of the receiver’s telephone, and crank a mechanism that would make that telephone ring. The operators could hear every call if they wanted and knew who had been talking to whom and when.

Did you give up your privacy to a third party when using this manual telephone service? Yes, arguably, you did. Under the digital doctrine applied now, phonecalls would have no privacy at all, under any circumstance. But as we know, phonecalls are private. In fact, the phonecall operators were oathsworn to never utter the smallest part of what they learned on the job about people’s private dealings — so seriously was privacy considered, even by the companies running the switchboards.

Interestingly enough, this “third-party surrender of privacy” doctrine seems to have appeared the moment the last switchboard operator left their job for today’s automated phone-circuit switches. This was as late as 1983, just at the dawn of digital consumer-level technology such as the Commodore 64.

This false equivalence alone should be sufficient to scuttle the doctrine of “voluntarily” surrendering privacy to a third party in the digital world, and therefore giving up expectation of privacy: the equivalence in the analog world was the direct opposite.

But there’s more to the analog equivalent of third-party-service privacy. Somewhere in this concept is the notion that you’re voluntarily choosing to give up your privacy, as an active informed act — in particular, an act that stands out of the ordinary, since the Constitutions of the world are very clear that the ordinary default case is that you have an expectation of privacy.

In other words, since people’s everyday lives are covered by expectations of privacy, there must be something outside of the ordinary that a government can claim gives it the right to take away somebody’s privacy. And this “outside the ordinary” has been that the people in question were carrying a cellphone, and so “voluntarily” gave up their right to privacy, as the cellphone gives away their location to the network operator by contacting cellphone towers.

But carrying a cellphone is expected behavior today. It is completely within the boundaries of “ordinary”. In terms of expectations, this doesn’t differ much from wearing jeans or a jacket. This leads us to a question: in the thought experiment that yesterday’s jeans manufacturers had been able to pinpoint your location, would it have been reasonable for the government to argue that you give up any expectation of privacy when you’re wearing jeans?

No. No, of course it wouldn’t.

It’s not like you’re carrying a wilderness tracking device for the express purpose of letting rescue services find you during a dangerous hike. In such a circumstance, it could be argued that you’re voluntarily carrying a locator device. But not when carrying something that everybody is expected to carry — indeed, something that everybody must carry in order to even function in today’s society.

When the only alternative to having your Constitutionally-guaranteed privacy is exile from modern society, a government should have a really thin case. Especially when the analog equivalent — analog phone switchboards — was never fair game in any case.

People deserve Analog Equivalent Privacy Rights.

Until a government recognizes this and voluntarily surrenders a power it has taken itself, which isn’t something people should hold their breath over, privacy remains your own responsibility.

Analog Equivalent Rights (7/21): Analog Libraries Were Private Searches for Information [Falkvinge on Liberty]

When our analog parents searched for information, that activity took place in libraries, and that was one of the most safeguarded privacies of all. When our digital children search for information, their innermost thoughts are instead harvested wholesale for marketing. How did this happen?

If you’re looking for one particular profession of the analog world that was absolutely obsessed with the privacy of its patrons, it was the librarians. Libraries were where people could search for their darkest secrets, be it in literature, science, shopping, or something else. The secrecy of libraries was downright legendary.

As bomb recipes started appearing on the proto-Internet in the 1980s — on so-called BBSes — and some politicians tried to play on moral panics, many people of common sense were quick to point out that these “text files with bomb recipes” were no different from what you would find in the chemistry section of a mediocre-or-better library — and libraries were sacred. There was no moral panic to play on, as soon as you pointed out that this was already available in every public library, for the public to access anonymously.

So private were libraries, in fact, that librarians were in collective outrage when the FBI started asking libraries for records of who had borrowed what book – and that’s how the infamous warrant canaries were invented. Yup, by a librarian, protecting the patrons of the library. Librarians have always been the profession defending privacy rights the hardest – in the analog as well as the digital.

In the analog world of our parents, their Freedom of Information was sacrosanct: their innermost thirst for learning, knowledge, and understanding. In the digital world of our children, their corresponding innermost thoughts are instead harvested wholesale and sold off to market trinkets into their faces.

It’s not just what our digital children successfully studied that’s up for grabs. In the terms of our analog parents, it’s what they ever went to the library for. It’s what they ever considered going to the library for. In the world of our digital children, everything they searched for is recorded — and everything they thought of searching for but didn’t.

Think about that for a moment: something that was so sacred for our analog parents that entire classes of professions would go on strike to preserve it, is now casually used for wholesale marketing in the world of our digital children.

Combine this with the previous article about everything you do, say, and think being recorded for later use against you, and we’re going to need a major change in thinking on this very soon.

There is no reason our children should have less Freedom of Information just because they happen to live in a digital environment, as compared to the analog environment of our parents. There is no reason our digital children shouldn’t enjoy Analog Equivalent Privacy Rights.

Of course, it can be argued that the Internet search engines are private services that are free to offer whatever services they like on whatever terms they like. But there were private libraries in the analog world of our parents, too. We’ll be returning to this “it’s private so you don’t have a say” concept a little later in this series.

Privacy remains your own responsibility.

Analog Equivalent Rights (6/21): Everything you do, say, or think today will be used against you in the future [Falkvinge on Liberty]

Privacy: “Everything you say or do can and will be used against you, at any point in the far future when the context and agreeableness of what you said or did has changed dramatically.” With the analog surveillance of our parents, everything was caught in the context of its time. The digital surveillance of our children saves everything for later use against them.

It’s a reality for our digital children so horrible, that not even Nineteen Eighty-Four managed to think of it. In the analog surveillance world, where people are put under surveillance only after they’ve been identified as suspects of a crime, everything we said and did was transient. If Winston’s telescreen missed him doing something bad, then it had missed the moment and Winston was safe.

The analog surveillance was transient for two reasons: one, it was assumed that all surveillance was people watching other people, and two, that nobody would have the capacity of instantly finding keywords in the past twenty years of somebody’s conversations. In the analog world of our parents, that would mean somebody would need to actually listen to twenty years’ worth of tape recordings, which would in turn take sixty years (as we only work 8 out of 24 hours). In the digital world of our children, surveillance agencies type a few words to get automatic transcripts of the saved-forever surveillance-of-everybody up on screen in realtime as they type the keywords – not just from one person’s conversation, but from everybody’s. (This isn’t even an exaggeration; this was reality in or about 2010 with the GCHQ-NSA XKEYSCORE program.)
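To see why this scales so effortlessly, here is a minimal sketch of an inverted index, the textbook data structure behind any keyword search over stored text; the transcripts and identifiers are invented, and this illustrates the principle rather than any agency’s actual software.

    # Minimal sketch of an inverted index: once every conversation is stored,
    # finding a keyword across decades of transcripts is a single dictionary
    # lookup, not sixty years of listening. All data below is invented.
    from collections import defaultdict

    transcripts = {
        "call-1998-03-13": "we could meet at the library after the lecture",
        "chat-2011-07-02": "bring the protest flyers and the megaphone",
        "mail-2017-12-24": "the library was closed so we went home",
    }

    index = defaultdict(set)
    for doc_id, text in transcripts.items():
        for word in text.lower().split():
            index[word].add(doc_id)

    print(index["library"])   # every recorded conversation, ever, mentioning that word
    print(index["protest"])   # and the same instant answer for any other keyword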

In the world of our analog parents, surveillance was only a thing at the specific time it was active, which was when you were under individual and concrete suspicion of a specific, already-committed, and serious crime.

In the world of our digital children, surveillance can be retroactively activated for any reason or no reason, with the net effect that everybody is under surveillance for everything they have ever done or said.

We should tell people how it has become instead: “anything you say or do can be used against you, for any reason or no reason, at any point in the future”.

The current generation has utterly failed to preserve the presumption of innocence, as it applies to surveillance, in the shift from our analog parents to our digital children.

This subtle addition – that everything is recorded for later use against you – amplifies the horrors of the previous aspects of surveillance by orders of magnitude.

Consider somebody asking you where you were on the evening of March 13, 1992. You would, at best, have a vague idea of what you did that year. (“Let’s see… I remember my military service started on March 3 of that year… and the first week was a tough boot camp in freezing winter forest… so I was probably… back at barracks after the first week, having the first military theory class of something? Or maybe that date was a Saturday or Sunday, in which case I’d be on weekend leave?” That’s about the maximum precision your memory can produce for twenty-five years past.)

However, when confronted with hard data on what you did, the people confronting you will have an utter and complete upper hand, because you simply can’t refute it. “You were in this room and said these words, according to our data transcript. These other people were also in the same room. We have to assume what you said was communicated with the intention for them to hear. What do you have to say for yourself?”

It doesn’t have to be 25 years ago. A few months back is enough for most memories to no longer be very detailed.

To illustrate further: consider that the NSA is known to store copies even of all encrypted correspondence today, on the assumption that even if it’s not breakable today, it will probably be so in the future. Consider that what you’re communicating encrypted today — in text, voice, or video — can be used against you in twenty years. You probably don’t even know half of it, because the window of acceptable behavior will have shifted in ways we cannot predict, as it always does. In the 1950s, it was completely socially acceptable to drop disparaging remarks about some minorities in society, which would socially ostracize you today. Other minorities are still okay to disparage, but might not be in the future.

When you’re listening to somebody talking from fifty years ago, they were talking in the context of their time, maybe even with the best of intentions by today’s standards. Yet, we could judge them harshly for their words interpreted by today’s context — today’s completely different context.

Our digital children will face exactly this scenario, because everything they do and say can and will be used against them, at any point in the future. It should not be this way. They should have every right to enjoy Analog Equivalent Privacy Rights.

Analog Equivalent Rights (5/21): Where did Freedom of Assembly go? [Falkvinge on Liberty]

Privacy: Our analog parents had the right to meet whomever they liked, wherever they liked, and discuss whatever they liked, without the government knowing. Our digital children have lost this, just because they use more modern items.

For a lot of our digital children’s activities, there’s no such thing as privacy anymore, as they naturally take place on the net. For people born 1980 and later, it doesn’t make sense to talk of “offline” or “online” activities. What older people see as “people spending time with their phone or computer”, younger see as socializing using their phone or computer.

This is an important distinction that the older generation tends to not understand.

Perhaps this is best illustrated with an anecdote from the previous generation again: The parents of our parents complained that our parents were talking with the phone, and not to another person using the phone. What our parents saw as socializing (using an old analog landline phone), their parents in turn saw as obsession with a device. There’s nothing new under the sun.

(Note: when I say “digital children” here, I am not referring to children as in young people below majority age; I am referring to the next generation of fully capable adult professionals.)

This digital socializing, however, can be limited, it can be… permissioned. As in, requiring somebody’s permission to socialize in the way you and your friends want, or even to socialize at all. The network effects are strong and create centralizing pressure toward a few platforms where everybody hangs out, and as these are private services, they get to set any terms and conditions they like for people assembling and socializing – for the billions of people assembling and socializing there.

Just as one example to illustrate this: Facebook is using American values for socializing, not universal values. Being super-against anything even slightly naked while being comparatively accepting of hate speech is not something inherently global; it is strictly American. If Facebook had been developed in France or Germany instead of the US, any and all nudity would be welcomed as art and free-body culture (Freikörperkultur) and a completely legitimate way of socializing, but the slightest genocide questioning would lead to an insta-kickban and reporting to authorities for criminal prosecution.

Therefore, just using the dominant Facebook as an example, any non-American way of socializing is effectively banned worldwide, and it’s likely that people developing and working with Facebook aren’t even aware of this. But the Freedom of Assembly hasn’t just been limited in the online sphere, but also in the classic analog offline world where our analog parents used to hang out (and still do).

Since people’s locations are tracked, as we saw in the previous post, it is possible to match locations between individuals and figure out who was talking to whom, as well as when and where this happened, even if they were only talking face to face. As I’m looking out my window from the office writing this piece, it just so happens that I’m looking at the old Stasi headquarters across from Alexanderplatz in former East Berlin. It was a little bit like Hotel California; people who checked in there tended to never leave. Stasi also tracked who was talking to whom, but required a ton of people to perform this task manually, just in order to walk behind other people and photograph whom they were talking to — and therefore, there was an economic limit to how many people could be tracked like this at any one time before the national economy couldn’t sustain more surveillance. Today, that limit is completely gone, and everybody is tracked all the time.

Do you really have Freedom of Assembly, when the fact that you’ve associated with a person — indeed, maybe just spent time in their physical proximity — can be held against you?

I’m going to illustrate this with an example. In a major leak recently, it doesn’t matter which one, a distant colleague of mine happened to celebrate a big event with a huge party in near physical proximity to where the documents were being copied at the same time, completely unaware and by sheer coincidence. Months later, this colleague was part of journalistically vetting those leaked documents and verifying their veracity, while at this time still unaware of the source and that they had held a big party very close to the origin of the documents.

The government was very aware of the physical proximity of the leak combined with this person’s journalistic access to the documents, though, and issued not one but two arrest-on-sight warrants for this distant colleague based on that coincidence. They are now living in exile outside of Sweden, and don’t expect to be able to return home anytime soon.

Privacy, including Privacy of Location, remains your own responsibility.

Analog Equivalent Rights (4/21): Our children have lost the Privacy of Location [Falkvinge on Liberty]

Privacy: In the analog world of our parents, as an ordinary citizen and not under surveillance because of being a suspect of a crime, it was taken for granted that you could walk around a city without authorities tracking you at the footstep level. Our children don’t have this right anymore in their digital world.

Not even the dystopias of the 1950s — Nineteen Eighty-Four, Brave New World, Colossus, and so on — managed to dream up the horrors of this element: the fact that every citizen is now carrying a governmental tracking device. They’re not just carrying one, they even bought it themselves. Not even Brave New World could have imagined this horror.

It started out innocently, of course. It always does. With the new “portable phones” — which, at this point, meant something like “not chained to the floor” — authorities discovered that people would still call the Emergency Services number (112, 911, et cetera) from their mobile phones, but not always be capable of giving their location themselves, something that the phone network was now capable of doing. So authorities mandated that the phone networks be technically capable of always giving a subscriber’s location, just in case they would call Emergency Services. In the United States, this was known as the E911 regulation (“Enhanced 9-1-1”).

This was in 2005. Things went bad very quickly from there. Imagine that just 12 years ago, we still had the right to roam around freely without authorities being capable of tracking our every footstep!

Before this point, governments supplied you with services so that you would be able to know your location, as had been the tradition since the naval lighthouse, but not so that they would be able to know your location. There’s a crucial difference here. And as always, the first breach was one of providing citizen services — in this case, emergency medical services — that only the most prescient dystopians would oppose.

What’s happened since?

Entire cities are using wi-fi passive tracking to track people at the individual, realtime, and sub-footstep level in the entire city center.

Train stations and airports, which used to be safe havens of anonymity in the analog world of our parents, have signs saying they employ realtime passive wi-fi and bluetooth tracking of everybody even coming close, and are connecting their tracking to personal identifying data. Correction: they have signs about it in the best case but do it regardless.

People’s locations are tracked in at least three different… not ways, but categories of ways:

Active: You carry a sensor of your location (GPS sensor, Glonass receiver, cell tower triangulator, or even visual identifier through the camera). You use the sensors to find your location, at one point in time or continuously. The government claims for itself the right to read the contents of your active sensors.

Passive: You take no action, but are still transmitting your location to the government continuously through a third party. In this category, we find cell tower triangulation as well as passive wi-fi and bluetooth tracking that require no action on the part of a user’s phone other than being on.

Hybrid: The government finds your location in occasional pings through active dragnets and ongoing technical fishing expeditions. This would not only include cellphone-related techniques, but also face recognition connected to urban CCTV networks.

Privacy of location is one of the Seven Privacies, and we can calmly say that without active countermeasures, it’s been completely lost in the transition from analog to digital. Our parents had privacy of location, especially in busy places like airports and train stations. Our children don’t have privacy of location, not in general, and particularly not in places like airports and train stations that were the safest havens of our analog parents.

How do we reinstate Privacy of Location today? It was taken for granted just 12 years ago.

Analog Equivalent Rights (3/21): Posting an Anonymous Public Message [Falkvinge on Liberty]

Privacy: The liberties of our parents are not being inherited by our children – they are being lost wholesale in the transition to digital. Today, we’ll look at the importance of posting anonymous public messages.

When I was in my teens, before the Internet (yes, really), there was something called BBSes – Bulletin Board Systems. They were digital equivalents of an analog Bulletin Board, which in turn was a glorified sheet of wood intended for posting messages to the public. In a sense, they were an anonymous equivalent of today’s webforum software, but you connected from your home computer directly to the BBS over a phone line, without connecting to the Internet first.

The analog Bulletin Boards are still in existence, of course, but mostly used for concert promotions and the occasional fringe political or religious announcement.

In the early 1990s, weird laws were coming into effect worldwide as a result of lobbying from the copyright industry: the owners of bulletin board systems could be held liable for what other people posted on them. The only way to avoid liability was to take down the post within seven days. Such liability had no analog equivalent at all; it was an outright ridiculous idea that the owner of a piece of land should be held responsible for a poster put up on a tree on that land, or even that the owner of a public piece of cardboard could be sued for the posters other people had glued up on that board.

Let’s take that again: it is extremely weird from a legal standpoint that an electronic hosting provider is in any way, shape, or form liable for the contents hosted on their platform. It has no analog equivalent whatsoever.

Sure, people could put up illegal analog posters on an analog bulletin board. That would be an illegal act. When that happened, it was the problem of law enforcement, and never of the bulletin board owner. The thought is ridiculous and has no place in the digital landscape either.

The proper digital equivalent isn’t to require logging to hand over upload IPs to law enforcement, either. An analog bulletin board owner is under no obligation whatsoever to somehow identify the people using the bulletin board, or even monitor whether it’s being used at all.

The Analog Equivalent Privacy Right for an electronic post hosting provider is for an uploader to be responsible for everything they upload for the public to see, with no liability at all for the hosting provider under any circumstance, including no requirement to log upload data to help law enforcement find an uploader. Such monitoring is not a requirement in the analog world of our parents, nor is there an analog liability for anything posted, and there is no reason to have it otherwise in the digital world of our children just because somebody doesn’t know how to run a business otherwise.

As a side note, the United States would not exist had today’s hosting liability laws been in place when it formed. A lot of writing was being circulated at the time arguing for breaking with the British Crown and forming an Independent Republic; from a criminal standpoint, this was inciting and abetting high treason. This writing was commonly nailed to trees and public posts, for the public to read and make up their own minds. Imagine for a moment if the landowners where such trees happened to stand had been charged with high treason for “hosting content” — the thought is as ridiculous in the analog world as it really is in the digital one. We just need to set aside the illusion that the current laws on digital hosting make any kind of sense. These laws really are as ridiculous in the digital world of our children as they would have been in the analog world of our parents.

Privacy remains your own responsibility.

Analog Equivalent Rights (2/21): The analog, anonymous letter and The Pirate Bay [Falkvinge on Liberty]

Privacy: Our parents were taking liberties for granted in their analog world, liberties that are not passed down to our children in the transition to digital — such as the simple right to send an anonymous letter.

Sometimes when speaking, I ask the audience how many would be okay with sites like The Pirate Bay, even if it means that artists are losing money from their operation. (Do note that this assertion is disputed: I’m asking the question on the basis of what-if the assertion is true.) Some people raise their hands, the proportion varying with audience and venue.

The copyright industry asserts that the offline laws should apply on the Internet too, when they want to sue and prosecute people sharing knowledge and culture. They’re right, but not in the way they think: copyright law does indeed apply online as well. But privacy laws don’t, and they should.

In the offline world, an analog letter was given a certain level of protection. This was not intended to cover just the physical letter as such, but correspondence in general; it was just that the letter was the only form of such correspondence when these liberties were drafted.

First, the letter was anonymous. It was your prerogative entirely whether you identified yourself as sender of the letter on the outside of the envelope, on the inside of the letter (so not even the postal service knew who sent it, only the recipient), or not at all.

Further, the letter was untracked in transit. The only governments tracking people’s correspondence were those we looked down on with enormous contempt.

Third, the letter was secret. The envelope would never be broken in transit.

Fourth, the carrier was never responsible for the contents, if nothing else for the simple reason that they were not allowed to examine the contents in the first place. But even if they could, as with an envelope-less postcard, they were never liable for executing their courier duties — this principle, courier immunity or messenger immunity, dates as far back as the Roman Empire.

These principles, the liberties of correspondence, should apply to online correspondence just as they apply to offline correspondence (the letter). But they don’t. You don’t have the right to send anything you like to anybody you like online, because it might be a copyright infringement — even though our parents had exactly this right in their offline world.

So the copyright industry is right – sending a copied drawing in a letter is a copyright infringement, and sending a copied piece of music over the net is the same kind of copyright infringement. But offline, there are checks and balances to these laws – even though it’s a copyright infringement, nobody is allowed to open the letter in transit just to see if it violates the law, because the secrecy of private correspondence is considered more important than discovering copyright infringements. This is key. This set of checks and balances has not been carried over into the digital environment.

The only time a letter is opened or stopped in transit is when somebody is under individual and prior suspicion of a serious crime. The words “individual” and “prior” are important here — opening letters just to see if they contain a non-serious crime in progress, like copyright infringement, is simply not permitted in the slightest.

There is no reason for the offline liberties of our parents to not be carried over into the same online liberties for our children, regardless of whether that means somebody doesn’t know how to run a business anymore.

After highlighting these points, I repeat the question whether the audience would be okay with sites like The Pirate Bay, even if it means an artist is losing income. And after making these points, basically everybody raises their hand to say they would be fine with it; they would be fine with our children having the same liberty as our parents, and the checks and balances of the offline world to also apply online.

Next in the series, we’re going to look at a related topic – public anonymous announcements and the important role the city square soapbox filled in shaping liberty.

Privacy remains your own responsibility.

Analog Equivalent Privacy Rights: Our children should have the same rights as our parents [Falkvinge on Liberty]

Privacy: In a series of 21 posts on this blog, we’ll examine how privacy rights — essential civil liberties — have been completely lost in the transition to digital. The erosion is nothing short of catastrophic.

In a series of posts on this blog, we will take a look at a large number of different areas where privacy has simply vanished in the transition to digital, and where it ended up instead. For each of the policy areas, we’ll take a look at where different jurisdictions stand and where the trends are pulling. The key takeaway is clear — it’s not the slightest bit unreasonable that our children should have at least the same set of civil liberties as our parents, and today, they don’t. They don’t at all.

To kick off, we’ll be looking at the liberties around the analog letter, and how many of the liberties around it — such as the taken-for-granted right to send an anonymous letter — have been completely lost. Same thing with anonymous public posters on billboards; who defends your right to make an anonymous political statement today?

We’ll be looking at how you no longer have the right to walk about in private, without somebody tracking you. It used to be a thing that airports and train stations were safe anonymous places for our parents; today, your phone is a realtime tracking beacon as soon as you approach them.

Further, we’ll take a look at how it used to be that authorities would need to catch you in the act doing something they didn’t like, but are now capable of rewinding the records 20 years or so to find something they missed when it happened, and maybe didn’t even care about then, perhaps something you didn’t even pay attention to at the time either, and much less remember 20 years later.

Our parents went to libraries and searched for information. The librarians went to extreme lengths, even inventing the warrant canary, to make sure people could search for whatever information they wanted and read whatever books they wanted without authorities knowing about it. Today, Google goes to the same extreme lengths, but to make note of everything you search for, up until and including what you almost search for but didn’t — and of course, all of it is available to authorities and governments, who only have to tell Google to follow the law they just wrote.

It is not the slightest bit unreasonable to demand that our children should have at least as many civil liberties — privacy rights — in their digital environment as our parents had in their analog environment. Yet these privacy rights have been almost abolished in the transition to digital.

Speaking of reading, our parents could buy a newspaper on the corner with some change. They could read a newspaper without anybody knowing that they had bought or read it. As opposed to our children, for whom it is carefully logged which newspapers they read, when, which articles, in what order, and for how long – and perhaps worst of all, what action they took right afterward, and whether it appeared to be caused by the last article they read.

Ah yes, cash at the newsstand. Cash anywhere, in fact. Several countries are trying to abolish cash, making all transactions traceable. A card is more convenient? Maybe. But it’s not more safe. Every purchase is logged. Worse, every almost-purchase of our children is also logged, something that would be inconceivable in the world of our parents. Even worse, every purchase is also permissioned, and can be denied by a third party.

Our parents didn’t have videocalls, or TVs looking back at them. But if they had, I’m reasonably sure they would have been horrified at our children having governments look straight into their living room, or watching them have private video calls, including very private video calls.

When our parents had a private conversation on the phone, there was never a stranger’s voice popping into the call and saying “you have mentioned a prohibited subject; please refrain from discussing prohibited subjects in the future”. This happens in private messaging in Facebook in the world of our children. This, of course, ties into the concept of having private conversations in our home, and how our children won’t even understand the concept of having a private conversation at home (but do understand that they can ask the little listening box for cookies and a dollhouse).

We’ll also look at how the copyright industry exploits pretty much all of this to attempt changing the world dramatically, in what can only be described as morally bankrupt.

This and much more in the coming series of 21 articles, of which this is the first.

Once again: Privacy promises from a company are worth nothing, because companies can’t promise anything [Falkvinge on Liberty]

Global: In the last post, I recalled that the only thing that matters for whether data collection is taking place is whether it’s technically possible, and that if you carry an electronic sensor, you must assume it to be active. Here’s why it doesn’t matter one bit if the sensor was made by “good guys” with exemplary and outstanding Terms and Conditions.

If data collection is possible, it is happening, and it will be used against the person it was collected from. That’s a reality which is provable with mathematical precision: the probability of data being collected is nonzero, and the probability of it being used against its owner is also nonzero. Since neither of these probabilities is falling over time, both events will, given enough time, take place with mathematical certainty. Therefore, the only way to have data not used against you is to make sure it’s not possible to collect it in the first place.
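
To make that compounding argument concrete, here is a minimal sketch of the arithmetic (my own illustration, not from the post; the 2% yearly figure is an arbitrary assumption chosen purely for demonstration):

    # Illustration only: the chance that collected data is used against you
    # at least once within n years, assuming an arbitrary 2% chance per year.
    p_per_year = 0.02
    for years in (1, 5, 10, 25, 50):
        p_at_least_once = 1 - (1 - p_per_year) ** years
        print(f"{years:>2} years: {p_at_least_once:.0%}")
    # Prints roughly 2%, 10%, 18%, 40%, 64%, and the figure keeps climbing
    # toward 100% for as long as the per-year probability stays nonzero.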

I hear a lot of people pointing at “good guy” companies that stand up for privacy, as if that means you can trust them with certainty. That is good, but it is not enough: a company can not only get new management, it is also completely at the mercy of the government it operates under.

In effect, a company does not even have agency to promise to protect any collected data. A few case studies:

In the Terms of Service of Dropbox, it was first stated that the files are encrypted, and that Dropbox employees are incapable of accessing your data. At some point, Dropbox mentioned that they’re doing server-side deduplication to save space. This is a compression technique where similar segments of files are only stored once. When this was mentioned, bright minds immediately realized that deduplication cannot take place unless Dropbox can determine that the files are similar, in which case they cannot be encrypted when this process happens. After an uproar, Dropbox changed its terms of service from employees being “incapable” of accessing client data, to employees being “not permitted” to access client data — which is an enormous difference, because it means the data is accessible to somebody walking into Dropbox offices and, say, flashing a badge. “Not permitted” counts for absolutely nothing.

Another case in point is Amazon Alexa, which is listening in on your living room (just like a lot of other devices do). Amazon had promised to never share anything it heard in your home, promising you privacy. This promise was only valid up until a District Attorney wanted those recordings as part of an ongoing investigation, at which point Amazon’s promises were completely null and void.

The only way to make sure that your privacy is kept intact is to not have your data collected in the first place. Companies, even when they promise you privacy, have no legal right to promise you anything — for the very next day, the government can walk into the company’s offices and carry that data out with it. Therefore, reading Privacy Policies or Terms of Service in hopes of finding good promises that your data will be kept safe is pointless, because no company can legally make such promises.

The one exception to governments getting away with this kind of behavior would be the story of Lavabit, where the founder chose to close the entire company overnight rather than comply with a nastygram from the NSA demanding the mail correspondence of Edward Snowden. But this is the exception to the rule. There is no scenario where a company keeps its promise and stays open, when a government says it wants the data in the custody of that company.

What’s The Important Thing, that is powerful enough to override all your deficiencies? [A Smart Bear: Startups and Marketing for Geeks]

Do you feel the crushing weight of the disadvantages facing every new company? No brand, no features, no customers, no money, no distribution, no search engine rankings, no efficient advertising, no incredible executive team, no NPS, no strategy.

How do the successful startups rise above all that? Do they solve all those problems at once, or at least quickly?

No.

I apologize in advance for using the dang iPhone as an example but… The iPhone is one of the most successful and important products of the past few decades. But the first version launched with a mountain of issues. It was a terrible phone, ironically. The whole idea was that it was a “smart phone,” yet everyone agreed their cheap-o Nokia flip-phone was ten times better at being a phone. Also, imagine launching an operating system that didn’t include “copy/paste.” Terrible!

But, the iPhone did something so well, that people wanted so badly, they would put up with all the other crap: You could actually use the internet. The real Internet with full websites and everything. The web actually worked (even if slowly). Email actually worked. In your pocket. It’s hard to explain the magic and excitement to a Gen-Z’er who takes it for granted. This was so compelling, all the other problems didn’t matter.

For more than ten years — an eon in tech-time — Heroku has been the dominant way that Rails developers launch public applications. When it first came out, it was rife with “deal-breakers” that developers continually whinged about. “What do you mean I have to use Bundler — it’s broken half the time!”  “What do you mean I can’t change the filesystem at run-time — I’ll have to change my algorithms!”  “What do you mean it doesn’t support MySQL — everyone uses MySQL!  My queries are going to break.”  “Wow these websites are really slow.”  On and on with the complaints, and all quite valid.

But, Heroku did something so well, that people wanted so badly, they would put up with all the other crap. You could type git push production and your site went live. You could use a knob on a web page to determine how scalable the site was. (Don’t worry, that knob is also connected to your wallet.) You never saw a server. You never messed with backup. You never worried how to securely stash your API keys. You always had a staging area to test things in a real server environment before pushing code live. DevOps became a thing of the past for a large class of applications. This was a revolution so important, so compelling, all the other problems didn’t matter. Developers changed their workflows and their code around Heroku and “12-Factor” apps; Heroku did not change to suit developers.

This is a universal pattern I call the “Important Thing.” Successful products deliver on something so fantastic, so game-changing, so important to the customer, that it is sufficient to override the overwhelming deficiencies of the rest of the product and company. So great that people tell their friends or force their colleagues to use it too (defeating the lack of marketing). So great that they’ll use it even if support is slow and releases have bugs (defeating the lack of operational excellence). So great that they’re excited to support a promising new company instead of worried about creating a dependency on a wobbly new company (defeating the lack of brand).

The Important Thing isn’t always a feature or technology. Fogbugz was never the leading bug-tracking system, but I employed it for most of the 2000s because I was a huge fan of Joel Spolsky’s blog, so it felt good to use a product made by a company whose values and behaviors I respected and learned from. The same happened with Basecamp and the 37signals blog. (You might be tempted to say Basecamp was successful because the work-style espoused by 37signals leads to successful products, but the contrary evidence is that all of their many subsequent products — built with the same work-style, by the same people, and even with the same code base — were all dramatically less successful, to the point that all of them have now been discontinued, and the company has been renamed “Basecamp” to emphasize that only the first of those experiments was ultimately successful.)

Here’s how to apply this to your own business:

It’s easy to get overwhelmed by the myriad of inadequacies you undoubtedly have. It’s tempting to attack them all, but worrying about everything and attacking simultaneously on all fronts with no weapons just leads to burn-out, and does not result in a company that is excellent on any front. Fortunately, you don’t need to solve all those problems. You need to solve almost none of them.

Instead, you need to focus on the one thing (maybe two) which is your Important Thing. The thing where, if you’re extraordinarily good at it, customers will overlook everything else.

It could be a feature (e.g. disappearing messages with Snapchat), but you can look beyond features. It could be enabling a lifestyle (such as remote-work or with-kids-work). It could be your online reputation (e.g. Joel Spolsky for me). It could be that you’re solving a problem in an industry that others overlook; having “any solution, even with problems” is better than having no solution. It could be that your culture resonates with an audience (e.g. 37signals), maybe due to an informal voice in an otherwise formal market, or because you have a cause — a higher purpose — so that people aren’t just buying a shirt or some software but rather they are supporting a movement (e.g. Patagonia who cares so much about the environment that there’s a company policy that they will bail employees out of jail if they’re arrested for peaceful protest, and by the way one of the results is that they have only 4% annual employee turn-over).

You should select something that you want to obsess over for the next five to ten years, that gets your customers excited, and which you at least have a possibility of executing. And then do that. Maybe only that. If you let all the other fires burn, maybe you have a shot at actually being excellent at that Important Thing.

If you do, all those other disadvantages exist, but aren’t fatal. That’s all you can hope for, at the beginning. And all you need.


Capturing Luck with “or” instead of “and” [A Smart Bear: Startups and Marketing for Geeks]

I won a fake stock market competition in elementary school. 

I put all my money in a few penny stocks — where prices are less than a dollar, and because of their small denomination, their value (as a percentage) fluctuates wildly. Some days I had the worst portfolio, other days I had the best. The competition happened to end on an up-day.

This was an example of “high risk, high reward.” Like startups.

Startups need luck too, in finding advertisement channels that work, in the right mix of features and usability that triggers product/market fit, in cultivating a useful social media presence, on employee number one working out well, on a competitor not making a fatal move, on there being enough money in the market, on appropriate pricing, on market forces not shifting the rules of the game, and the list goes on. And that’s after the luck of where and when you were born, the color of your skin, your gender, and that list goes on too. 

When you put it that way, it’s obvious why startups fail so frequently! They need a lot of success in a lot of areas, which is a lot of “good luck” to string together.

What can you do, to reduce this effect and maybe even turn luck to work in your favor?

The list above is a bunch of “ands.” That is, you need a good marketing channel and you need a few killer features and you need great initial employees and you need a healthy market, etc. “And” is bad! It’s bad because each one has a probability of success, and you compute the total probability of success by multiplying them. No matter how optimistic you are about those probabilities, the end product is a small number. Even 70% multiplied by itself five times is only 17%; most of those things don’t have as good of a chance as 70%.

So the first question is: Can you reduce the number of things which have to go right? Can you convert some of those things into 100%?  For example, can you pick a large and growing market? Can you compete in a niche where incumbents don’t care or cannot move quickly? Can you hire someone you’ve worked with before, or build something sustainable without hiring?

Even so, there will be plenty of challenges, so we need a second technique for boosting probability: Leverage “or” instead of “and.”

Consider marketing channels. You could get your first few hundred customers through GoogleAds, or Facebook ads, or affiliate sales, or targeted outbound sales, or partnering with a high-profile reseller, or great press about your unique brand and message, or other ways. Only one of these needs to work! So although the probability of success for each one of these is low, the probability that something will work is higher.
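
As a rough sketch of the arithmetic behind “and” versus “or” (the probabilities below are made up purely for illustration):

    # Illustration only, with made-up probabilities.
    # "And": five requirements that must ALL succeed, each at 70%.
    p_and = 0.7 ** 5                # ~0.17: multiplying shrinks the total fast
    # "Or": five independent options where ANY ONE succeeding is enough,
    # each with only a 30% chance on its own.
    p_or = 1 - (1 - 0.3) ** 5       # ~0.83: at least one is likely to work
    print(f"five 'and's at 70% each: {p_and:.0%}")
    print(f"five 'or's at 30% each: {p_or:.0%}")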

This is true of everything from product features to website copy for conversions to avenues for exiting the company years from now. The general rule is optionality is strength.  When there are lots of ways for things to go right, that is a strong position even if you haven’t actualized one of those ways.

The converse of this is a business that has extra “and” clauses — even more than usual. Marketplaces, for example, almost never succeed. When they do succeed, they are often durable and profitable, which makes them a smart bet for a Venture Capitalist that can maintain a diversified portfolio of attempts, but for the individual business it’s a tough road. For example, a marketplace has to thrive both with the sellers and the buyers — if either one is disinterested, or is too expensive to corral, or doesn’t find value, or prefers to transact outside the marketplace, then the marketplace fails. Those are “ands!” Also, many marketplaces often only deliver value at scale; so another “and” is that they have to also “scale down” so the first 100 buyers and sellers also see value.

By accumulating “and” requirements, you are lowering the probability of success. By stringing together possible solutions with “or,” you are increasing the number of ways that luck could smile upon you. 

Set yourself up for luck!


Kung Fu [A Smart Bear: Startups and Marketing for Geeks]

Startup strategy is like Kung Fu. There are many styles that work. But in a bar fight, you’re going to get punched in the face regardless.

I can only teach you my style. Others can only teach you theirs.

This is my style.

“MVPs” are too M to be V. They’re a selfish ploy, tricking people who thought they were customers into being alpha testers. Build SLCs instead.

I don’t like freemium; I want to learn from people who care enough to pay, not from the 20x more who don’t. 

Founders almost never have a real strategy. They say things like “we have a unique feature” and “the incumbents are dumb,” which might be true, but isn’t a strategy. They don’t know how to analyze a market or competition, so they make it up instead of learning how to do it. This is a common but largely unacknowledged reason why companies fail. Founders explain failures with things like “our two main competitors did [thing] to us” or “customers didn’t understand [our point of view].” The implication is that this was unknowable or bad luck, but the truth is, this was a predictable result of not understanding the market.

Power Laws are useful to understand and exploit. But don’t get hung up on them. The 10,000th biggest company in the world is a very successful company, as is the two-person company where the founders each take home $300k/year.

All startups are screwed up. Even when they’re succeeding they are screwed up. (HT Mike Maples Jr.) Corollary: A startup has to be so excellent at one or two key things that they can screw everything else up and not die. Sometimes that’s airtight product/market fit. Sometimes that’s defensible distribution channels. Sometimes that’s product design so thrilling that every customer spreads the word to five more. Sometimes that’s a market insight that takes competitors five years to understand. Sometimes that’s a dream team that weathers the storm that sinks the other boats. The bad news is, you don’t know ahead of time what that thing will be. The good news is, it’s OK that most things are screwed up.

Another Corollary: Your competitors are screwed up too. Don’t assume they’re smarter than you, faster than you, more strategic than you, growing faster than you, making better decisions than you. They are doing that sometimes, but you are too, and all of you are screwed up. When you look at them, you’re seeing their best, exaggerated projection, which isn’t the truth. Every time a company dies, read what they were writing a week earlier: proud, confident, optimistic, possibly even arrogant and boastful. Ignore all of it.

If you have more than three priorities, you have none.   (HT Tony Hsieh)

It’s better to complete 100% of 8 things than 80% of 10 things.  (HT Dave Kellogg)

Too often, decisions are made “because a competitor is doing [something]” or “because a competitor might do [something].” Occasionally, that’s the right motivation. But usually, you should focus on what’s best for the customer and your company.

Before you pronounce one of your competitors “dumb,” consider that everyone working there is very likely of above-average intelligence. More useful is to ask: “If a thoughtful person were making that choice, why would that be?” You might learn something about their strategy or goals.

LTV is invalid until the company is more than five years old; even then it’s more noise than signal. Instead, watch payback period for acquisition efficiency, watch retention for product/market fit, watch expansion revenue for long-term growth, and watch gross margin for long-term profitability.

Churn needs to be lower than you think; 3% monthly churn is too high, and means you’re either not delivering recurring value, haven’t found product/market fit, are in a market that stinks, or have some other critical problem.
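
A quick compounding check (my own arithmetic, not the author’s) shows why 3% monthly churn is so punishing:

    # Illustration only: what a 3% monthly churn rate compounds to over a year.
    monthly_churn = 0.03
    retained_after_year = (1 - monthly_churn) ** 12    # ~0.69
    annual_churn = 1 - retained_after_year             # ~0.31
    print(f"3% monthly churn = {annual_churn:.0%} of customers lost per year")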

There isn’t one most important SaaS metric.  Priority depends on your goals (e.g. profitability versus size) and on what, at this moment, is so out of whack that ignoring it is fatal. 

You’re not allocating enough costs to gross margin or the cost to acquire a customer. You think this doesn’t matter because you’re not a public company or didn’t raise money, but it does matter because it means you don’t understand the financial mechanics of your business.

Fermi estimation is a good way to figure out whether a startup or product could even theoretically be viable. Beware: people tend to round up to the better power of ten because they don’t want to face the truth. Real life usually rounds down.
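As one hedged example of what such an estimate might look like (every input below is an assumption invented for illustration, not data from the post):

    # Illustration only: a Fermi estimate of theoretical annual revenue.
    reachable_prospects = 100_000    # people you could plausibly reach
    conversion_rate = 0.01           # assume 1% ever become paying customers
    price_per_month = 50             # dollars
    annual_revenue = reachable_prospects * conversion_rate * price_per_month * 12
    print(f"~${annual_revenue:,.0f} per year, at best")   # ~$600,000
    # Per the warning above: resist rounding these inputs up; reality rounds down.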

Being focused on SaaS metrics is not incompatible with valuing employee fulfillment and customer happiness. In fact, the latter two are crucial for producing the former.

A lot of businesses aren’t profitable even at scale. Founders assume “we can fix that later.” Often, you can’t.

I know you got profitable in three months, but you didn’t.

Telling the truth to customers and employees, especially when it’s difficult, is how you earn trust and loyalty. 

The best business model in 2019 is “doing good.” It attracts and retains talent even when salaries are low and risk of failure is high. It attracts customers even when prices are high and quality is low. It’s also the way to marry capitalism with morality.

Your values are tested only when the decisions are tough, like losing money, hurting your brand, firing a highly-productive employee who isn’t a culture-fit, or a wonderful culture-fit who isn’t productive. Your values are defined by what you tolerate. Your culture, and even your purpose, is the outcome of living your values.

Find and focus on one reliable distribution mechanism before diluting your time diversifying. If you can’t find one sizable sustainable source of customers, the solution won’t be found by piecing together three small unsustainable ones.

The shiny new fad is hyper-competitive, temporary, and money pouring in means crappy companies can mess about for years, poisoning the market. The “boring” but established, large market is where revenue is easy and competition is old, slow, and has something to lose.

The only time you need “truly unique” tech and “an impossible-to-cross moat” is when your goal is to build a $1B+ company, which almost no one is (or should).

You can start by selling to small customers and evolve to larger ones, because you’re starting with a low cost-basis and then maturing your product and service. Or you can start selling to enterprise — nothing wrong with that — but then your high cost-basis of marketing, sales, and service will not scale downward.

Selling to the mid-market is hard. If you do it, expand into it later, after you’ve already mastered a different segment.

If you think Biz Dev is the solution to a difficult problem, you’re wrong. If you think it’s a way to add incremental value, then maybe, but still unlikely until you’re at $50M+ in ARR.

Great products don’t sell themselves, but they can cause customers to talk about them to other customers. You still have to get the first customers, and most of the rest, yourself.

Most “customer discovery” is really a founder in love with a product, trying to justify to herself that there is a market. She does this by selling for forty-five minutes instead of investigating and disproving. And by talking about cool features instead of asking about price. And by remembering confirmatory evidence while conveniently forgetting or explaining away the rest, even though “the rest” is where the real learnings reside.

Pricing determines everything else. Corollary: Price must be part of your initial customer discovery, not an afterthought after you “first make sure there’s a pain.”

It’s hard to get 1000 paying customers. It’s easy to get to 100 if you have a real product. Price so that 100-200 is enough for all the founders to work full-time. This means you have to charge $50-$500/mo, and make something of genuine value.
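
To spell out the arithmetic behind that price range (a sketch with assumed numbers, not a rule from the post):

    # Illustration only: why ~100-200 customers implies roughly $50-$500/mo pricing.
    # Assume three founders who each need about $10k/mo gross to work full-time.
    founders, needed_per_founder = 3, 10_000
    monthly_target = founders * needed_per_founder     # $30,000/mo
    for customers in (100, 200):
        print(f"{customers} customers -> ${monthly_target / customers:.0f}/mo each")
    # 100 customers -> $300/mo; 200 -> $150/mo: inside the $50-$500 band.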

Most of the time, real pain points addressed by great products are not a business. Which is why founders confidently begin and are surprised when it fails. A business also requires that there are many potential customers, who realize they have the problem, who you can reach at a reasonable cost, and then convince to convert, at a profitable price, against existing market dynamics, and last for years. Early on, your job is to validate that there’s a business, not to validate that your idea is good or that a pain exists. 

“Sales” is not a dirty word.

A reliable paid acquisition channel results in a somewhat stable business. It’s boring and doesn’t make you famous and doesn’t play into the false but common narrative that SEO and viral content will launch your startup into the market with almost no money. So people run after the false narrative instead of the thing that’s most likely to work.

People don’t value their time. They will do crazy things to save $2. Don’t sell a product that saves time to people who don’t care about saving time. Businesses often don’t care just as much as consumers don’t care.

Corollary: Sell more value, not more time. Even businesses who can compute the ROI of saving time, will compute a much larger ROI for creating value.

Multitasking is bad.

Your time being 90% allocated is bad.

Bet on things which are true today, and will be even more true in five years; not on your guess at how the future will be different. (HT Jeff Bezos) You want to argue that the future is unknowable, but that’s just an excuse for not having a strategy.

The “long tail” can sound appealing, but it sure is easy to sell vanilla ice cream at the beach even when you’re right next to another ice cream stand.

Yes, marketplace businesses can be valuable and defensible. But their failure rate is much higher than product businesses, and they require copious venture funding. They have to work with 10 people in the system, not just when there’s 10M. 

If someone deploys a code change and brings down the entire system, the fault is with the brittleness of the system, not with the person.

Corollary: If a talented person at your company does something “dumb,” the next question is: What did you fail to do as a leader? Did that person not have the right information, or enough context, or were they worried about something, or what?

One-on-ones are never a waste of time. Agendas are optional and sometimes even counterproductive. 

If you’re the smartest one in the room, you’ve made a terrible mistake. Either you haven’t hired great talent, or you have but you’re disempowering them. This is the opposite of what a great leader does, and minimizes the success of the organization. Andrew Carnegie wrote for his own tombstone: “Here lies a man who knew how to bring into his service men better than he was himself.” Ignore the gendered language but heed the lesson.

Starting up is the act of doing as many jobs as possible so your company can survive. Scaling is the act of shedding as many jobs as possible so your company can survive.  [quoted from Aaron Levie]

If you believe someone with a title of XYZ isn’t useful, or important, you’ve never worked with greatness at that function. Maintain that attitude, if you want a blind spot in your organization forever.

If someone at your company has no way to grow into a new role, they will leave, as well they should. “New role” can mean sophistication, management, or a different job. “No way” can mean because they’re unable or because the role they want cannot exist.

Founders are caught by surprise by the scaling phase. If you haven’t operated at the executive level at a scaling startup, you don’t appreciate how different and difficult it is. There are not enough blogs or books about this phase; often leaders go underground. Founders arrogantly believe that the beginning is the hardest part, because it is hard. But many startups top out between $5m-$20m in revenue. That’s fine if you don’t wish to scale. But if you do, your arrogance prevents you from the necessary transformations. What got you to $20m is very different from what gets you to $100m. You need help, new sorts of employees with a different organization, and you’d better surround yourself with people with more experience and skills than you have. How will your ego cope with that?

If you can’t unplug for a month without adversely affecting the business, then you have a brittle business. If the business is young, this is inevitable. Otherwise, this is a failure of organizational development.

Management systems, whether about performance management, or leadership, or healthy teams, or product strategy, or one-on-one interactions, or meetings, or productivity — are like deciding between coding styles. They’re all pretty good, so just pick one system you can stick to, and maximize it.

Make important decisions by optimizing for the one or two most important things, not by satisfying a dozen constraints. Maximize opportunity rather than minimize down-side.

Leave money on the table. The customer should derive 10x more value than it costs them. You earn loyalty and future upgrades. Karma works.

If you can’t double your prices, you’re in a weak market position. Determine the causes and address them purposefully.

Your product must materially impact one of your customer’s top three priorities. Otherwise, they don’t have time to talk to you.  (HT Tom Tunguz)

A good strategy is to be the System of Record for something.  (HT Tom Tunguz)

It’s more powerful to be 10x better at one thing than to shore up ten weaknesses.

If your sales and marketing expenses are high, that either means your marketing is extremely competitive, or that people don’t naturally believe they need your product. If the latter, they might be right. If the former, you need large differentiation — more than a feature or two. If you don’t solve this problem, the company is ultimately unsustainable.

No one will read the Important Text in your dialog box.

Design is important, yet many of the $1B+ SaaS public companies have poor design. So, other things are more important.

Operate on cash-basis. Analyze on GAAP-basis. Don’t cheat on GAAP — you’re only lying to yourself.

“I don’t have time” actually means “I don’t want to.” 

The only cause of Writer’s Block is high standards. Type garbage. Editing is 10x easier than writing.

Vitriol online usually comes from that person defending self image or impressing others. Either way, it’s about them, not you. Often there’s a constructive learning inside which you should take to heart, but discard the petulant packaging.

Your Impostor Syndrome is tiresome. Just stop already.

“Desire to seem clever, to be talked about, to be remembered after death, to get your own back on the grown-ups who snubbed you in childhood, etc., etc. It is humbug to pretend this is not a motive, and a strong one. Writers share this characteristic with scientists, artists, politicians, lawyers, soldiers, successful businessmen — in short, with the whole top crust of humanity.” — George Orwell, Why I Write

The desire to impress others drives behavior more than logical argument. Founders start companies to show everyone else that they’re better than those everyones give them credit for. Angel investors want a story and celebrity by association at cocktail parties. Making your parents proud is often a stronger force than solving for the best risk-adjusted return.

It’s cliché, but it really is the journey, not the destination.

A lot of it is luck. Nevertheless, your operating plan should assume it’s all under your control, and run experiments. It’s your ego that needs to remember that a lot of it is luck.

Once we gain power or success, we forget what it took to get there, and we can’t account for the role of luck. This impedes sympathy and the ability to deliver appropriate advice. 

“Everyone thinks of changing the world. No one thinks of changing themselves.”  —Leo Tolstoy. Founders have to be arrogant to believe they can create something new and better, but the great founders are not so arrogant that they don’t evolve. Bezos, Zuckerberg, Page, Hoffman, all have a history of personal transformation. This is more difficult than building products.

You can have two Big Things in your life, but not three.

There’s no point in creating another huge company with a crappy culture that no one really wants to work at. But there’s every reason to create a company that creates 100 good jobs that people love working at, and where they are valued, and can grow. 

The first 10 people will join because you’re a startup: They get excitement, influence, cachet, unique experience, and a small shot at outsized remuneration. But why will the 500th person join? If you can’t answer that, you aren’t building a company worth joining, which means you won’t be hiring awesome talent, and it means you don’t have a purpose other than revenue. Fix that now.

Your top talent are volunteers. They can get another job across the street for more money, because they’re the same person you hired except with more experience. It takes a different attitude and toolbox to motivate, inspire, and lead a volunteer, rather than a servant.

Making a difference in the world means having an impact on people’s lives. Even just a few people. It does not mean tricking 1,000,000 people to stick their nose in your app. One person who loves coming to work instead of loathing what they do with 1/3rd of their time on Earth, is enough. One person who was given a chance to grow and progress instead of being stuck being under-paid and under-developed, and under-appreciated, is enough.

Some things in life have to be experienced to be understood. Kids, founding a startup, scaling a startup, selling a startup. Does this contradict the rest of this page? Life can be contradictory.

This is my Kung Fu style. If you dislike my style, you are not wrong. In fact, that’s good — the world needs many styles. 

All styles have strengths and weaknesses. 

Pick the style that fits you, and master it.


When “fits and starts” is the most efficient path [A Smart Bear: Startups and Marketing for Geeks]

You roll down the windows and wear a helmet when you take your car to the track. This does not make me less terrified of a fiery death.

The American Autocross champion was sitting in my passenger seat screaming at me not to let my foot off the pedal until I bounced off the RPM limiter. She was properly intense. I didn’t know what I was doing, but it was fun to power through curves in a high-speed, tenuously controlled skid in my Mini Cooper S (plus Cooperworks).

The proper driving strategy for Autocross is bizarre. It is possibly the worst-case scenario for wear and tear on a car. The strategy solves for the spaghetti-like pattern of the track, which is composed mostly of turns and banks, so that the winner is the one who can best negotiate a complex path, rather than which car is fastest on a straight-away.

(Image: the actual track I was on)

Driving strategy hinges on this constraint: a car can accelerate quickest when it is not turning. “Accelerate” means getting faster or getting slower. Mashing the accelerator pedal or the brake pedal while turning results in a spin-out.

So, you do this strange thing where you aim the car at a particular point near the first section of the turn and accelerate as much as possible in a straight line, as if you’re going to fly off the course. As you near that point, you brake with just as much vigor. Still without turning the wheel. Then you turn and (more slowly) accelerate at just the right pace such that you’re gently skidding, but still in control. Until you can see the next point on the next turn that you full-accelerate straight towards.

What point should you pick? That depends on the curve, but drivers will find what they believe to be optimal points, and will often put small orange cones there as a visual guide, especially during practice runs.

(Image: autocross racing cones)

While this results in an unnatural, jerky, discontinuous motion, it is also the fastest way through the course.

Intuitively, it seems like there would be some smooth, efficient, graceful path, but that’s the slower way. And the goal is speed, not grace. (Of course, there’s a certain grace in speed, but not for the passenger being thrown about the cabin.)

Companies in the scaling phase feel like this too.

In most ways you are moving faster than ever, particularly when you’re moving in a straight line. For example, a new incremental product will launch to your existing customer base, with immediate impact in the millions of dollars — something a small company will take years to accumulate.

But in other ways it is jerky, unnatural course-corrections. Teams re-form with new people and new missions. Things that worked for seven years suddenly don’t work, and “tiger teams” assemble to fix them, while new people wonder why it was ever built this way in the first place, and older people laugh, knowing that in two years, future-new people will be saying the same thing about what the now-new people are building.

Nothing is exempt: teams, processes, products, sales motions, branding, interviewing, culture, office space, customer interactions, architecture, security, finance, governance, hiring, …

It is certainly difficult to be jerked around like this. Some people can’t take the forces, and that’s understandable. No one said this would be easy.

But it’s also the fastest way around the track.

And, while difficult, there’s no feeling like it.


A Scorecard: Should a decision be fast, or slow? [A Smart Bear: Startups and Marketing for Geeks]

We all know that startups should make decisions quickly. Fast decisions lead to rapid action, which accelerates the loop of production and feedback, which is how you outpace and out-learn a competitor, even one that already has a lead.

But some decisions should not be made in haste, like a key executive hire, or how to price, or whether to raise money, or whether to invest millions of dollars in a new product line.

How do you know when your current decision should be made slowly: contemplative, collaborative, deliberate, data-driven, even agonizing?

I’ve made the following scorecard to figure out whether it’s wise to go slow:

  1. Can’t undo.  This is the classic one-/two-way door delineation. If you can’t easily undo the decision, it’s worth investing more effort into analyzing the likelihood of the upsides and risks.
  2. Huge effort. Some things take less time to implement than to estimate or to debate. Remember that it might take two engineers a week to implement something, but a few debates and some research might itself occupy an entire engineering and product team for a week as well. This is one reason why small teams without process can produce results faster than larger teams with process. If the effort to implement the decision is smaller than the effort to make the decision, just knock it out. But if you’re deciding on a path that could take six months to measure results from, taking time up front to research is wise.
  3. No compelling event.  If the status quo isn’t that bad, there might not be a reason why a decision should be made quickly. Without time-pressure, it’s more justifiable to spend more time on the decision. Conversely, time-pressure means the more time you spend deciding, the less time you have for implementation and unanticipated problems, so you’re adding risk by dragging out the decision.
  4. Not accustomed to making these kinds of decisions. Online marketing teams are accustomed to throwing creative things at the wall, with new technology and platforms, because that’s the day-to-day reality of their job.  Because they’re good at it, they don’t waste time hand-wringing over whether or not to try an advertising campaign on the latest social media platform; they just do it. Conversely, most organizations have no experience with major decisions like pricing changes or acquisitions, and most founders have no idea how to hire a great executive, or how to decide whether to invest millions of dollars in a new product line as opposed to “just throwing something out there and iterating,” as was the correct path at the start of the company. When the organization has never made this type of decision before, the decision is at greater risk, and being more deliberate with research, data, debate, or even outside advice, is wise.
  5. Don’t know how to evaluate the options.  Even after generating the choices, does the team understand how best to analyze them? If the company’s strategy is clear and detailed, if relevant data is at hand, if it’s clear what your goals are, if the deciding team has confidence, then the decision could be easy and fast; if these things are absent, perhaps more deliberation is needed to clarify those things.
  6. Can’t measure incremental success.  After the decision is made and implementation begins, can you objectively tell whether things are going well? If yes, it is easy to course-correct, or even change the decision, in the presence of reality. But if progress will be invisible or subjective, such that you will sink person-years of time into the implementation before knowing how things are going, it’s worth spending more effort ahead of time gaining confidence in the path you’ve selected.
  7. Imperfect information. Buying a house is nerve-racking, mostly because it is likely the most expensive and difficult-to-undo purchase of your life, but also because you know so little about the goods. What does the seller know but isn’t telling you? What will you not discover until you’ve moved in, or a year later? Often it is impossible to get the data or research you need to make an objective decision. When this is the case, it is sometimes wise to spend extra time gathering whatever information you can, maybe investing in reports or experts (which is what you do with a house). Or you could look at it the opposite way: If it’s impossible to get objective data informing the decision, then don’t spend lots of time debating subjective points; just make the decision from experience and even gut-check, because we just said that’s all you have to go on anyway.
  8. Decision requires multiple teams who haven’t worked together before.  At WP Engine we’re extremely collaborative across teams. The benefit is that we work together for a common goal, taking care of the needs of support, sales, marketing, engineering, product, and even finance, rather than solving for one department’s goals at the expense of another. But this also can make decisions more difficult, because finding a good solution is complex, often requiring compromise or creativity which requires time to be realized. This effect is amplified if the teams (or team members) haven’t worked together before, and thus have less rapport, common language, and common experience. In that case, give the decision more time to breathe and develop, because really you’re giving people the time to build relationships and discover great solutions, and that in itself is a benefit to them and your organizational intelligence, which is a long-term benefit worth investing in.

Actually this isn’t a scorecard, because important decisions aren’t a Cosmo Quiz. Don’t use this as a rubric; don’t score it 1-5 and add it up with a spreadsheet.

Rather, this is a framework for thinking through what needs to be done. Honestly answer these questions, and by the time you’re through, you’ll have a good sense of whether a light touch, quick decision is fine (which should be the default answer!), or whether you’ve justified taking more time.

And, depending on which pieces are problematic, you’ll have a guide for what needs to be done next.

For example, if “Can’t undo” is a big problem, can you rethink the solution so that it can be undone, maybe by investing more time, or creating a disaster recovery plan of action, or splitting up the decision so that part of it is undoable?

Or for example, if “No compelling event” is a problem, maybe the best answer is to “not decide,” i.e. don’t spend time on this right now, since you don’t have to. Some people will be disappointed in the lack of a decision, but it’s better to honestly state that “we can’t figure out the answer right now” than to make a rash decision that does more harm than good, or to invest time in a decision that doesn’t need to be made, at the expense of work that does need to be done.

I hope this helps you make the right decisions, in the right way.


How repositioning a product allows you to 8x its price [A Smart Bear: Startups and Marketing for Geeks]

Pricing is often more about positioning and perceived value than it is about cost-analysis and unconvincing ROI calculators.

As a result, repositioning can allow you to charge many times more than you think. Here’s how.

You’ve created a marketing tool called DoubleDown that doubles the cost-efficiency of AdWords campaigns. You heard that right, folks — as a marketer, you can generate the same impact, the same number of conversions, the same quality of sales leads, but with half your current ad-spend. Wonderful! Who doesn’t want higher ROI?

What can you charge for this tool? Clearly you can’t charge as much as the money the customer is saving on AdWords, otherwise the net result is no savings at all. Let’s say you can charge 25% of the savings and still find many willing customers.

Here’s what your sales pitch looks like to a specific customer who spends $40,000 per month on AdWords:

Halve Your Spend!

Great deal! The VP of Demand Gen will be able to boast to the CMO that she saved the company $15,000/mo even after paying for DoubleDown, and you’re raking in a cool $5,000/mo. Everyone’s happy!

Now let’s see why you can actually charge eight times as much money for the same product.

Marketers have a single paramount goal: Growth. Even indirect marketing like brand, events, and PR have the long-term goal of supporting growth. In the case of DoubleDown’s customers it’s direct: Growth through lead-generation through AdWords.

Growth is much more valuable than cost. To see why, consider the following two scenarios:

  1. CMO reports to the CEO: I was able to reduce costs 20% this year.  The CEO is happy. The CEO’s follow-up question is: How will we use those savings to grow faster?
  2. CMO reports to the CEO: I was able to increase growth by 20% this year, but it also cost us 20% more to achieve.  The CEO pumps her fists amid peals of joyous laughter. The value of the company increases non-linearly. The additional revenue growth more than pays for the additional marketing cost that generated it. The CEO’s follow-up question is: How can we ensure this happens again next year?

It’s always 10x more valuable for a business to grow faster than it is for the business to save money.

This insight points us to an alternate pitch for DoubleDown. It’s not about spending less for the same amount of growth, it’s about spending more to create more growth.

In particular, using our example of the customer who currently spends $40,000/mo, suppose that customer is generating 200 quality sales leads per month from that spend. The sales pitch changes as follows:

You’re paying $200/lead right now, yielding 200 leads per month. Using DoubleDown, you can double the number of leads you’re generating, still at a cost of $200/lead:

Double Your Leads!

The key is this: The customer is willing to spend $40,000 to generate 200 leads, and therefore is happy to spend $80,000 to generate 400 leads. It doesn’t matter how much of that $80,000 is going to AdWords versus going to DoubleDown. The key is not to “save money on AdWords,” but rather to “generate more growth at a similar unit cost.”

In the “saves money” pitch, the value was $20,000/mo, and the customer needed to keep 75% of that value-creation. Whereas in the “generate growth” pitch, the value is $40,000/mo, and the customer is happy to pay 100% of that value-creation to a vendor. Both the amount of value created and the percentage of that value the customer is willing to pay are multiples higher for the “growth” pitch versus the “save money” pitch.
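To make the 8x concrete, here is a minimal sketch of the arithmetic above, in Python, using the post’s own example numbers (DoubleDown and every figure here are the hypothetical scenario from the article, not real data):

    # Hypothetical numbers from the DoubleDown example above.
    current_spend = 40_000                # $/month the customer spends on AdWords today
    current_leads = 200                   # leads/month at that spend
    cost_per_lead = current_spend / current_leads        # $200/lead

    # Pitch 1: "Halve your spend" -- same 200 leads, half the AdWords bill.
    savings = current_spend / 2                           # $20,000/month of value created
    save_money_price = 0.25 * savings                     # vendor keeps 25% => $5,000/month

    # Pitch 2: "Double your leads" -- customer keeps paying $200/lead, now for 400 leads.
    target_leads = 2 * current_leads                      # 400 leads/month
    acceptable_total = target_leads * cost_per_lead       # $80,000/month the customer will spend
    adwords_needed = target_leads * (cost_per_lead / 2)   # doubled efficiency => $40,000 to AdWords
    growth_price = acceptable_total - adwords_needed      # vendor can capture the rest: $40,000/month

    print(f"'Save money' price:    ${save_money_price:,.0f}/mo")
    print(f"'Create growth' price: ${growth_price:,.0f}/mo")
    print(f"Ratio: {growth_price / save_money_price:.0f}x")  # -> 8x

Same product, same customer; only the framing of the value changes.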

So the next time you want to formulate your product as a way to “save time” or “save money” or “be more efficient” … DON’T!

Instead, figure out how your product creates value in the way your customer already measures value, and position your product as a way to accomplish that.


WP Engine passes $100M in revenue and secures $250M investment from Silver Lake [A Smart Bear: Startups and Marketing for Geeks]

People said there’s no money in hosting. WordPress is just a toy. After the success of Smart Bear, I should be setting my sights on something big, not this.

I’m sure people said similar things to Heather when she joined as our CEO.

The Silicon Valley-oriented technology press outlets don’t cover us because we’re not in San Francisco, even though we’re more successful than most of the startups they cover.

We’ve come a long way from switching this blog to WordPress in 2009, my systematic vetting of the business idea in 2009 (after needing it myself due to the success of this blog crashing my dedicated server every time I got on Hackernews), the “coming soon” pre-launch in April 2010, our Series A 3-minute pitch in 2011, our incredible CEO Heather Brunner joining in 2013, opening offices in San Francisco in 2012, San Antonio in 2014, London in 2015, Limerick in 2016, and Brisbane in 2017, the launch of the first Digital Experience Platform for WordPress this year, creating a truly incredible culture based on real values, managing scale, inventing new ways of building products, and even having this blog take a back seat to the demands of the business.

We just announced a few more things.

Late last year we passed $100M in annual recurring revenue. We’re less than 8 years old so you can do the math on growth rates and figure out that we’re on an elite trajectory. That revenue is thanks to 75,000 customers, earned through the hard work of 500 employees across six offices on three continents. Every day, 5% of the entire online world (roughly 3.5 billion people) visits a customer running on the WP Engine Digital Experience Platform.

This week we closed $250M in financing from Silver Lake, the premier technology private equity firm. The majority of the funds pay back our early investors who believed in us enough to trust us with their money.  Of course a nice chunk is primary capital, i.e. for the company balance sheet, to invest in growth initiatives, security and quality, and advancing our existing strategic priorities through acceleration and de-risking.

We have never been in a stronger position. We have never had the caliber of teams we do today, as evidenced by our award-winning 70+ NPS customer service, our historic-low cancellation rates, our security and uptime, our product and engineering initiatives, our global brand leadership, our customer acquisition through both marketing and sales, our hiring and employee experience teams, our finance, legal, and governance teams, our executive leadership, and perhaps most important of all, the strength of our culture which embodies excellence, service, transparency and inclusiveness. We remain steadfast in our commitment to continuing to increase all of the above.

And now, with Silver Lake’s investment and support, we can accelerate our growth investing even more into our strategic roadmap, and placing some new bets on ideas we’ve had but haven’t been able to find the space to explore.

I’ve always said nothing beats the high of getting that first customer to sign up.  It’s the heroin-hit that hooks the entrepreneur.  (The next sale isn’t quite as sweet.)

That’s still an accurate portrayal, but there are other moments that are even more thrilling. This moment in WP Engine’s saga is one of those.

Thank you to all our customers, who vote for us every month with their pocketbooks, but even more importantly, entrusting us with their brand and online success, which we treat as vitally as we do our own.

And a special thank you to Heather Brunner, our CEO for the past four and a half years. Long-time readers of this (11-year old!!) blog might automatically assume that all this success is due to my prescience and wisdom, but the truth is that although I’ll certainly take the credit for the initial construction and lift-off of the rocket, setting up an impressive and rare trajectory, the reason we are in the position we are today, with all the attributes listed above, is due to Heather’s leadership, strength, vision, and execution. Period, full-stop. And everyone else at WP Engine would tell you the same thing.

So to everyone at WP Engine, let me repeat the message from one year ago:

Look what we did!


Brittleness comes from “One Thing” [A Smart Bear: Startups and Marketing for Geeks]

We’re tired of hearing how small software companies usually fail.

The data show that the two most common causes are: (1) Product isn’t useful to enough people, and (2) Problems with the team.

But what about the companies that die even though they did sell some copies of software, and where the early team isn’t dysfunctional?

I don’t have data for that cohort (tell me if you do!), but informally I’ve observed the following things, which follow a pattern that can be identified and counteracted:

  1. The initial marketing channel quickly saturated, so growth stalled at a non-zero but unsustainably-low rate.
  2. The initial marketing channel was sustainable for a while, but got wiped away due to external forces.  Examples: large bidders tripled the cost per click, Google’s SEO algorithm changed, the event organizers changed the rules or stopped doing the event, the link-sharing site became irrelevant, the hot blog lost its traffic, the magazine running the ads finally failed.
  3. The product was built on a platform, and the platform changed. Examples: A popular app drops to zero downloads after Apple builds it into iOS; A Microsoft Office add-on drops to zero sales after Microsoft builds that feature into Office; A Twitter utility breaks when Twitter removes functionality from their public API.
  4. The company landed one big customer representing 80% of total revenue, but that customer canceled. It wasn’t a mistake to sign that customer — it funded the entire company. But sometimes you experience the adverse end of that risk.
  5. A key employee left the company, which caused the company to fail. Early on, a 10x person can mint the company but also could be irreplaceable. A suitable replacement is too rare; it takes too long to find someone, convince them to join for almost no salary, and get them up-to-speed and productive.

When a company has revenue but is susceptible to the fatal afflictions above, I call it “brittle.” It’s a real business, but it’s easy to break.

The pattern, which suggests a remedy, is: Brittleness manifests wherever there is “One Thing.”

A technological example makes this clear. Suppose I have a single server that runs my website. Any number of things can cause this server to break — a power failure, a network failure, a bad configuration change, too much traffic arriving at once, and so on. How do you make this situation less brittle?

Let’s take power failure. Power can fail if the power supply inside the server burns up (typically because the fan inside it failed), or the power strip or power cord fails (maybe a wetware failure like accidentally unplugging it), or the power source running to the power strip could fail. You can address all this by having a second copy of the components — a second power strip with a second cord plugged into a second power supply. This is, in fact, exactly what data centers do! In short: redundancy — having two things that do the same job, instead of just one thing. It’s twice as expensive, but it buys you robustness.

But what happens if the power fails between the main power system in the data center and the cabinet where the two power strips are? That’s another case of “one thing.” So you could have two cables running to every cabinet, from two identical power units. Again this is what advanced data centers do.

But what happens if the city power fails? Data centers have their own gas-powered generators. Which means they have to stock large amounts of gasoline. Gas-powered engines that are used infrequently have a tendency to stop working, so they have to test and maintain those units. Data centers often have multiple generators. Robustness purchased at large expense.

In modern clouds we go yet another step further, because the entire data center itself is “One Thing.” So you have additional servers in other physically-separate data centers that draw power and network connectivity from different vendors.

The data-center example is applicable to all of the causes of failure above.

 

“One marketing channel” is brittle, because if anything happens to it, that could be the end of an otherwise-healthy company. The solution is to layer on additional marketing channels, so that variation in any one of them is not fatal.

“One platform” is brittle, because if they forward-integrate (i.e. copy you) or just fail, that’s the end of the company. One solution is to be multi-platform (which social media management tools did, and which we did with cloud infrastructure providers at WP Engine); another solution is to only build on platforms where you have a high degree of confidence that the platform owners are committed to supporting their ecosystem — never directly competing with the companies built on it, and in fact promoting them. (Salesforce is currently the best in the world at this.)

“One big customer” is brittle. One solution is a long-term contract with a serious breakup clause, as insurance that pays for you to bridge to more customers. Another is to prioritize accelerating sales until that customer represents a percentage of revenue that you can stomach. Also, up-front payments, so you have the cash-flow to invest in that growth right now. The typical attitude is, “We now have a large customer, so pour extra money into development to make sure we don’t lose it,” but the right attitude is to use a lot of that money to land other customers.

“One key employee” is brittle. Not only might they leave, but what if they just get sick or need to take a vacation? The usual refrain in the startup world is that none of these are options — everyone has to work 70+ hours/week and never falter. Talk about brittle!

Solving these things takes time and money. These aren’t quick-fix solutions. You can’t just hire three more fantastic developers to create a robust engineering team, and you can’t just snap your fingers and find three new efficient, productive marketing channels.

Therefore, the right attitude is to maintain clarity on these risks and ask which one is best to work towards right now. For example, it’s cheaper and easier to experiment with new marketing channels than it is to find, interview, convince, and manage a second software developer, and plus if you can get a second marketing channel online, that will generate revenue, which in turn means you can afford a second software developer. In this scenario, the best thing is to focus all your energy on getting a second marketing channel running.

As you scale, the size of the “chunks” that create brittleness also scale, which creates new One Things, thus new risks and new investments. For example, with $260M in 2016 revenue, still growing at a blistering 60%/year, with a thousand employees, Hubspot is not brittle in any of the ways outlined above. But they recognized in 2016 that they were a single-product company. That is a “One Thing.” If there were a sea-change in the market for inbound marketing software, that could be fatal to Hubspot. It also limits long-term growth as the market matures and saturates. The way out is redundancy — becoming a multi-product company, but not where one product is 95% of revenue. They attacked that problem, and today (Nov 2017) they’re well on their way, as recognized by the media at large.

Finally, on a personal note, there’s another “chunk-level” that’s even larger than all of the preceding, and it’s a brittleness that almost all founders suffer from, including myself. The chunk of “the entire company.”

This is a component of why founders are almost always sad and sometimes permanently depressed even after a successful sale of a company. This was your “One Thing” for years. This is your identity, your life. You don’t have hobbies or even good friends anymore. You might have sacrificed family or health. Talk about a “One Thing.” Your entire life is brittle.

The solution here is not to have two companies or two jobs. That’s burnout; a lack of singular focus creates worse outcomes.

Rather, the solution is to realize that there were things you did and loved before and there will be things you will do and will love after. This is a chapter in a book. Even if one chapter is sad or has an unexpected twist, there’s still the next chapter which you can look forward to, even if you don’t yet know how that story will unfold.

Robustness, not in many things simultaneously, but in things serially. That’s the wrong attitude for solving tactical problems at work, but it’s the right attitude for thinking about the arc of your life.

Back to today and the here-and-now. Go list all the “One Things” which make your business brittle. Only tackle one or two things at a time — you have to manage risk, not pretend that you can eliminate all of it at once. Be thoughtful, and build steadily away from brittleness.


You can have two Big Things, but not three [A Smart Bear: Startups and Marketing for Geeks]

Forget work/life integration for a minute. How much time do you have, regardless of partitioning?

From your 24-hour daily allotment, the 1950s-style break-down is 8 hours for work, 8 for home and commute, and 8 for sleep and ablutions. So, “work” and “home” are the two things in which you can spend 40+ hours per week.

This is the amount of time it takes to tackle something huge. A career. A parent. A startup.

There are weekends and vacations and sick days and such, but those don’t add up to enough concentrated time to carry off something like a startup without causing work or home to suffer.

Of course “work” and “home” are just placeholders for “Big Things.” If you’re unattached, “home” doesn’t occupy significant time.

The rule of life is: You can have two “Big Things” in your life, but not three.

Big Things include:

  • Job
  • Kids
  • Spouse
  • Social Life
  • Major Hobby   (e.g. build a boat in the garage, become a chess master, video game addiction)
  • Startup

You can do a startup on the side while you have a day job, but your family will never see you. You might even lose your family. It happens. This is partly why it’s easier to start a company before you have a family or even a spouse.

You can have a job and a social life, but unless your spouse is fully integrated and agreeable to that social life, there will be strife. “Going out with the guys again?”

Yes, “kids” and “spouse” are on the list separately. Young kids strain marriages because there’s not enough time to invest in the kids as well as be there for each other.

Some people try to “have it all.” Men and women both. But it’s never true. At most two can function well; the rest do not. More often, there’s just one that receives the majority of the energy, and the rest suffer.

Note that “Sleep” isn’t on the list of options, even though it’s mathematically the same in terms of time commitment. That’s because cutting out sleep doesn’t work — then you can’t function at a high level at anything.

No, you are not an exception. That’s egotistical self-deception. Not on sleep, and not on the number of Big Things. Ask the people around you if they think you’re not failing at one of your Big Things.

Time to decide which two.


The fundamental lesson of the forces governing scaling startups [A Smart Bear: Startups and Marketing for Geeks]

 

Idealistic founders believe they will break the mold when they scale, and not turn into a “typical big company.” By which they mean: Without stupid rules that assume employees are dumb or evil, without everything taking ten times longer than it should, without wall-to-wall meetings, without resorting to hiring anything less than the top 1% of the talent pool, and so on.

That is: keeping the positive characteristics of a tiny organization and avoiding the common problems of a larger one, by preserving the existing values and processes, just doing it with more people, and figuring it out along the way, exactly as they always have.

Why do they never succeed? Why is this impossible when you have 500 employees? What are the fundamental forces that transform organizations at scale?

From Brittle to Robust

A “team of one” is the fastest, most efficient team, as measured by “output per person.”  Communication and decision-making occupy the minimum possible time. And maybe the person working on that thing is a “hero” — working extended hours and experienced with the problem space. Small companies operate this way by necessity, and it works!  It’s a big reason why they move quickly.

But, an illness takes the velocity of the product or quality of support from heroic to zero. And if that person leaves, you’ve just lost six months to hire and get back up to speed on that thing.  Or nine months because there weren’t any processes and documentation in place — again because it was just one person, who didn’t need that stuff, because after all we’re moving so quickly!

Or it’s fatal because that was a co-founder. “Founder trouble” is a leading cause of startup death (though data also show that companies with only one founder are more likely to fail, so the conclusion is simply that startups are always likely to fail!)

A team of one is brittle, but fast.  When you’re small, this is a good trade-off, because speed is critical for combating the things that are constantly about to kill the company.  When you’re large, and you might have 15-25% annual employee turnover, not to mention illness, vacation, and family, the same structure would sink you immediately.

So, no project can have fewer than, say, three people dedicated to it, plus people management and possibly some form of Product or Project Management. But that team of four will not be 4x more productive than the one-person team; per-person productivity goes down in exchange for robustness and continuity.

On the other hand, while the small company loses 9 months to the loss of a key employee, or even implodes, the big company is the steady turtle that adds thousands of customers per month like clockwork and wins the race.

Predictability

When you’re small there’s no need to predict when the feature will ship. Marketing isn’t scheduling a launch and recruiting isn’t timing the start-dates of the next 50 hires in customer service and sales. This means you can — and should! — optimize myopically for speed-to-market.

Small companies brag about their speed as an advantage, but it’s easy to see why the larger company actually has a massive advantage. Sure, when WP Engine launches a new product, the marketing department needs predictability for the launch date, but that’s because it’s a highly-skilled, well-funded group, which explodes with press, events, campaigns, social media, and newsletters, grabbing more attention in a single week than a smaller company might garner in a year. There are also armed, globally-dispersed Sales and Support teams, so we’re selling to our 70,000 existing customers as well as thousands of new customers per month, which means we’ll end up adding more new revenue in one month than a small company will take in over a whole year.

The tradeoff, however, is predictability. We didn’t line up that press and have those sales materials and ensure code-quality high enough to scale on day one, without predictability. Predictability means going slower. Predictability requires more estimation (takes time), coordination (takes time), planning (takes time), documentation (takes time), and adjusting the plan when it inevitably unfolds differently from the prediction (takes time).

Predictability is also required for healthy team-growth. Consider the timeline of adding a technical support team member. First, Recruiting is casting about for potential candidates. Then scheduling and performing interviews. Then waiting for them to quit their job and take a week off. Then new-employee-orientation. Then classroom training. Then paired up with senior folks on the floor as they ramp up their skills and comfort. Then finally, after (say) four months, they’re up to speed.

Since that takes four months, we have to be able to predict the demand for technical support at least four months in advance, because we have to be hiring for that future demand right now. If we under-estimate, our support folks get overwhelmed with too much work, their quality of life suffers, and service to each customer suffers; if we over-estimate, we have too many people, which is a cost penalty. Of course, the latter is a better failure mode than the former, but both are sub-optimal, and the solution is predictability.

“The future is inherently unpredictable,” insists the small company, spurred on by Lean and Agile mindsets. Indeed, blue-sky invention and execution are hard to predict. But this is also a self-fulfilling prophecy; to insist the future is unpredictable is to ignore the work that could make it more predictable, which of course makes it in fact unpredictable to that person.

Small companies don’t have the data, customers, institutional knowledge, expertise, and often the personal experience and skillset to predict the future, so they are usually correct in saying it’s impossible. But it’s not impossible in principle, it’s impossible for them. At scale, it becomes required. Not because Wall Street demands it, or investors demand it, or any other throw-away derogatory excuse made by unpredictable organizations, but because it’s critical for healthy scaling.

Materiality Threshold

If Google launches a new product that generates $10,000,000/year in revenue, is that good? No, it’s a colossal failure. They could have taken the tens of millions of dollars that the product cost to develop, and made their existing operation just 0.01% more effective, and made the same amount of money.

At nearly $100B/year in revenue, Google can only consider products which have the potential to generate $1B/year in revenue as an absolute floor, with the potential to grow to $10B/year if things go better than expected. Things like YouTube, Cloud, and self-driving cars.

This principle is called the “Materiality Threshold,” i.e. what is the minimum contribution a project must deliver for it to be material to the business.

With a small business, the materiality threshold is near $0. A new feature that helps you land just a few new customers this month is worth doing. A marketing campaign that adds two sign-ups/week is a success. Almost anything you do, counts. That’s easy, and it feels good to be moving forward. But it’s only easy because the bar is so low.

The financial success of the larger company dictates a non-trivial materiality threshold. This is difficult. Even a modest-sized company will need millions in revenue from new products, maybe tens of millions in the optimistic case. Very few products can generate that sort of revenue, whether invented by nimble, innovative startups or stately mature companies. As proof, consider that the vast majority of startups never reach a $10M/year run-rate, even with decent products and extraordinarily dedicated and capable teams.

Yet, it’s the job of a Product Manager at that mid-sized company to invent, discover, design, implement, and nurture those products — something that most entrepreneurs will never succeed at. Tough job!

Recruiting

Employee #2 will join a startup for the experience. Even at a significant salary cut, and even if the company fails — the most likely outcome. It’s worth it for the stories, the influence, the potential, the thrill, the control, the camaraderie, the cocktail-party-talk.

Employee #200 won’t join for those reasons. Employee #200 will have a different risk-profile regarding their life and career. Employee #200 will be interested in different sorts of problems to solve, like the ones listed in this document instead of the ones where you’re trying to understand why 7 people bought the software but the next 3 didn’t. Employee #200 will not work for a pay-cut.

Small companies could view this as an advantage, and certainly it’s advantageous to recruit amazing people at sub-market rates. But there are dozens if not hundreds of employees at WP Engine today who are more skilled in their areas of expertise than anyone I’ve ever met at a small startup. Why? Because after developing that expertise, they find it’s only possible and enjoyable to apply their skills within a larger environment.

For example, there are advanced marketing techniques that would never make sense with a smaller company, that are fascinating, challenging, and impactful to the top line at a larger company. There are talented people who love that challenge and would hate going “back to the Kindergarten of marketing” scratching out an AdWords campaign with a $2000/mo budget or assembling the rudiments of SEO or just trying to get a single marketing channel to work or being called a “growth hacker” because they finagled a one-time bump in traffic.

But, this has implications around compensation, how you find that talent, and why that person wants to work at your company instead of the one down the block who can pay a little bit more. Therefore, it’s critical to have a mission that is genuinely important, have meaningful and interesting work to do, connect everyone’s work to something bigger than any of us. These matter even more at scale, because they’re the anchor and the primary reason why talent will join and stay.

 

Communication

With four people in a company, any information that needs to be shared can be told to just three other people. Everyone can know everything. If there’s a 5% chance of significant misunderstanding, that event doesn’t happen often.

With four hundred people, it’s never true that a piece of information can be reliably communicated in a short period of time. A 5% chance of misunderstanding means twenty people are confused. In software terminology, communication challenges scale as O(n²).

“Slack” is not the answer. “Email” is not the answer. (Your emails are probably misinterpreted 40% of the time, by the way.) Repetition is the answer, in different formats, at different times, by many leaders, and even still it’s never 100%.
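As a rough back-of-the-envelope sketch of that O(n²) claim (the 5% misunderstanding rate is the post’s own illustrative figure, not measured data):

    # Pairwise communication channels grow roughly with the square of headcount.
    def channels(n: int) -> int:
        """Distinct pairs of people who may need to share information."""
        return n * (n - 1) // 2

    for headcount in (4, 40, 400):
        pairs = channels(headcount)
        confused = 0.05 * headcount   # ~5% of listeners misunderstand any given announcement
        print(f"{headcount:>4} people: {pairs:>6} channels, ~{confused:.0f} confused per announcement")

With 4 people there are 6 channels and a misunderstanding is a rare event; with 400 people there are nearly 80,000 channels and every announcement confuses roughly twenty people, which is why repetition across formats, times, and leaders is the only thing that works.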

Technology & Infrastructure

Managing 10,000 virtual servers in the Cloud Era sounds easy. Automate everything, then any process that works for 100 servers, will work for 10,000 servers just by doing the same thing repeatedly — exactly the thing computers are excellent at.

It never works like that. Reddit took 18 months to get the “number of likes” to work at scale. StackOverflow took 4 years to get everything converted to HTTPS. Wired did that conversion in a “mere” 18 months. Everything is hard at scale.

What are the patterns in those stories?

One is that scale makes rare things common. Rare things are hard to predict and can be hard to prevent. Often they’re hard to even identify and sometimes impossible to reproduce. This is fundamentally difficult.

Another is continuity or compatibility with existing technology. New companies get to start from scratch, but at-scale companies must transform. New companies like to make fun of large companies for how hard it is to transform, neglecting that the cause of the difficulty might also be generating $100,000,000 in revenue.

Another is bottlenecking. All hardware and software systems have bottlenecks. At small scale, you don’t run into any bottlenecks, or at least the ones you do can be solved with simple techniques like increasing capacity. Eventually something difficult breaks and you have to rearchitect the stack to solve it. Even something simple like converting HTTP links to HTTPS or updating “number of likes” in real-time, becomes a monumental architectural challenge.

Not only does this slow down development, it adds investment. There will be entire teams who focus on infrastructure, scaling, deploys, cost-management, development processes, and so forth, none of which are directly visible to or driven by the customer, but which are necessary to manage the complexities of scale.

Risk-mitigation

For a small company, the most likely cause of death is suicide. Usually it’s starvation — can’t get enough customers (distribution) to pay enough money for long enough (product/market fit). But also things like founders splitting up, not getting enough traction to self-fund or to secure the next round of financing, having to go back to a day job, and so on.

At scale, the risks are completely different. There is very low risk that WP Engine will not sign up thousands of new customers this month. Other risks, however, are not only possible, but likely. Addressing those risks head-on, is required for a healthy and sustainable business that can last for many years.

Take the risk of business continuity during a disaster scenario. What if all availability zones of Amazon in Virginia are disabled for a week? How quickly could we get all our customers back up and running? Would that be true even though thousands of other businesses are also trying to spin up servers in other Amazon data centers at the same time? Could we communicate all this with our customers quickly and simply, so that our support team isn’t overwhelmed by repeating the same message to nearly a hundred thousand justifiably-worried customers?

Risk-mitigation can even result in growth. Serious customers want to see that their vendors understand and mitigate risk; this maturity becomes a selling point. That’s why enterprise suppliers are constantly flaunting their compliance with SOC 2 and ISO 27001 and all the rest. Small companies make fun of those things as being unnecessary at best or a false sense of security at worst, but while they’re busy making that point, the larger companies are busy signing three-year, multi-million-dollar clients.

Early on, you do not need a disaster-recovery plan. That won’t be the thing that will kill the business, and your customers will understand if a young business is subject to that sort of risk. Later on, this becomes critical, and worth investing in.

The fundamental challenge of scaling: Embracing and implementing the shift from Small to Large

These forces cause larger companies to be fundamentally different than small ones. This isn’t a bad thing or a good thing. It’s a different thing.

Some idealistic founders believe the root cause of scaling issues is the “command-and-control” organizational structure. But none of the examples above make reference to any organizational structure. It’s universal. This is why Holacracy and Teal Organizations do not solve these problems in practice. It could be a fantastic idea to experiment with organizational structure, but the fundamental forces above will not be eliminated through recombination of roles and organization.

Scaling is hard, the road is foggy and bendy, it lasts for years, the set of people you need might be different, and no one emerges unscathed. So, it is not a sign of disaster if you have difficulty wrestling with these forces. Everyone does.

Disaster is when a company is scaling, but the leaders don’t appreciate these forces, don’t work constantly to morph the organization accordingly, don’t bring in experienced talent, and decide they can figure it all out as they go along without help. Scaling should instead mean new people, new roles, new values, new processes, new recruiting, new stories, new constraints, new opportunities.

Too many founders and leaders want to believe that “What got us here is what’s important and unique about us, and thus we should preserve all of it. Other companies fail because they ‘act like big companies,’ but we’ll avoid all that because we’re smarter than they were. As evidence of our acuity, just look at our success thus far. We will continue to succeed in the future as we have in the past.”

But they’re wrong.

There should be a few values that are kept constant, that’s true. Otherwise none of it means anything. But the details must change.

Many founders and leaders can’t make the shift. This always hurts the company, and sometimes kills the company. The world is full of those horror stories. It’s sad, because it’s an avoidable waste of opportunity and sometimes hundreds of person-years of effort.

Don’t become one of those cautionary tales.


The Cloud in 2021: Adoption Continues [Radar]

Last year, our report on cloud adoption showed that companies were moving quickly to adopt the cloud: 88% of our respondents from January 2020 said that they used the cloud, and about 25% said that their companies planned to move all of their applications to the cloud in the coming year.

This year, we wanted to see whether that trend continued, so we ran another online survey from July 26, 2021, to August 4, 2021. Since the 2020 survey was taken when the pandemic was looming but hadn’t yet taken hold, we were also curious about how the lockdown affected cloud adoption. The short answer is “not much”; we saw surprisingly few changes between January 2020 and July 2021. Cloud adoption was proceeding rapidly; that’s still the case.

Executive summary

  • Roughly 90% of the respondents indicated that their organizations are using the cloud. That’s a small increase over last year’s 88%.
  • The response to the survey was global; all continents (save Antarctica) were represented. Compared to last year, there was a much higher percentage of respondents from Europe (33%, as opposed to 11%) and a lower percentage from North America (42%).
  • In every industry, at least 75% of the respondents work for organizations using the cloud. The most proactive industries are retail & ecommerce, finance & banking, and software.
  • Amazon Web Services (AWS) (62%), Microsoft Azure (48%), and Google Cloud (33%) are still the big three, though Amazon’s market share has dropped slightly since last year (down from 67%). Most respondents use multiple cloud providers.
  • Industry to industry, we saw few differences in cloud providers, with two exceptions: Azure is used more heavily than AWS in the government and consulting & professional services sectors.
  • Two-thirds of respondents (67%) reported using a public cloud; 45% are using a private cloud; and 55% are using traditionally managed on-premises infrastructure.
  • Almost half (48%) said they plan to migrate 50% or more of their applications to the cloud in the coming year. 20% plan to migrate all of their applications.
  • 47% said that their organizations are pursuing a cloud-first strategy. 30% said that their organizations are already cloud native, and 37% said that they plan to be cloud native within three or more years. Only 5% are engaged in “repatriation” (bringing cloud applications back to on-premises infrastructure).
  • Among respondents who are using the cloud, the biggest concern is managing cost (30%). Compliance is a relatively minor concern (10%) and isn’t the most significant concern even in heavily regulated sectors such as finance & banking (15%), government (19%), and healthcare (19%).
  • When asked what skills organizations needed to succeed, respondents were divided fairly evenly, with “cloud-based security” (59%) and “general cloud knowledge” (54%) the most common responses.

Demographics: Who responded

The survey was sent to recipients of O’Reilly’s Programming and Infrastructure & Ops Newsletters, which together have 436,000 subscribers. 2,834 respondents completed the survey.

The respondents represent a relatively senior group. 36% have over 10 years’ experience in their current role, and almost half (49%) have over seven years’ experience. Newer developers were also well-represented. 23% have spent one to three years in their current position, and 8% have spent under one year.

Parsing job titles is always problematic, given that the same position can be expressed in many different ways. Nevertheless, the top five job titles were developer (4.9%), software engineer (3.9%), CTO (3.0%), software developer (3.0%), and architect (2.3%). We were surprised by the number of respondents who had the title CTO or CEO. Nobody listed CDO or chief data officer as a title.

Aggregating terms like “software,” “developer,” “programmer,” and others lets us estimate that 36% of the respondents are programmers. 21% are architects or technical leads. 10% are C-suite executives or directors. 8% are managers. Only 7% are data professionals (analysts, data scientists, or statisticians), and only 6% are operations staff (DevOps practitioners, sysadmins, or site reliability engineers).

The respondents came from 128 different countries and were spread across all continents except for Antarctica. Most of the respondents were from North America (42%) and Europe (33%). 13% were from Asia, though that almost certainly doesn’t reflect the extent of cloud computing in Asia.

In particular, there were few respondents from China: only 8, or about 0.3% of the total. South America, Oceania, and Africa were also represented, by 6%, 4%, and 2% of the respondents, respectively. These results are significantly different from last year’s. In 2020, two-thirds of the respondents were from North America, and only 11% were from Europe. The other continents showed little change. Last year, we noted that European organizations were reluctant to adopt the cloud. That’s clearly no longer true.

Cloud users are spread throughout the industrial spectrum. Respondents to our survey were clustered most strongly in the software industry (36%). The next largest group comprises those who replied “other” (13%), and they are indeed scattered through industries from art to aviation (including some outliers like prophecy, which we never knew was an industry). Consulting & professional services (12%) was third; we suspect that many respondents in this group could equally well say they were in the software industry. Finance & banking (11%) was also well-represented. 5% of the respondents work in healthcare; another 5% were from higher education; 4% were in government; and a total of 4% work in electronics & hardware or computers (2% each). Surprisingly, only 3% of the respondents work in retail & ecommerce; we would have expected Amazon alone to account for that.

These results are very similar to the results from last year’s survey, with two major differences: this year, an even larger percentage of our respondents were from the software industry (23% in 2020), and a significantly larger group classified their industries as “other” (20%).

(Figure: survey respondents by industry)

What does this mean? Less than it seems. We have to remind both ourselves and our readers that the number of respondents in any sector reflects, first, the size of that sector; second, our mailing lists’ penetration into that sector; and only third, cloud usage in that sector. The fact that 35% of the respondents are in the software industry while only 5% are in healthcare doesn’t by itself mean that the cloud has penetrated much more deeply into software. It means that the healthcare industry has fewer readers of our newsletters than does the software industry, hardly a surprising conclusion. To estimate cloud usage in any given sector, you have to look only at that sector’s data. What it says is that our conclusions about the software industry are based on roughly 1,000 respondents, while conclusions about the healthcare industry are only based on about 150 respondents, and are correspondingly less reliable.
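As a back-of-the-envelope illustration of that reliability gap (this is not a calculation from the report, and it assumes simple random sampling, which a self-selected newsletter survey does not satisfy), the margin of error on a reported proportion shrinks with the square root of the sample size:

    import math

    def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
        """Approximate 95% margin of error for a sample proportion."""
        return z * math.sqrt(p * (1 - p) / n)

    p = 0.90  # e.g. "roughly 90% use the cloud"
    for sector, n in (("software", 1000), ("healthcare", 150)):
        print(f"{sector:>10} (n={n}): ±{margin_of_error(p, n) * 100:.1f} percentage points")

That works out to roughly ±2 points for the software cohort versus ±5 points for healthcare, which is why sector-level conclusions drawn from the smaller cohorts should be read more loosely.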

The big picture

The big picture won’t surprise anyone. Almost all of the respondents work for organizations that are using cloud computing; only 10.3% answered a question asking why their organization doesn’t use cloud computing, implying that cloud usage is 89.7%. Likewise, when asked what cloud provider they’re using, 10.7% said “not applicable,” suggesting cloud usage of 89.3%. We can get a third fix on cloud usage by looking at a later question about cloud technologies. We asked whether respondents are using public clouds, private clouds, hybrid clouds, multiclouds, or traditionally managed infrastructure. Respondents were allowed to select multiple answers, and most did. However, respondents whose organizations aren’t using any cloud technology would check “traditionally managed infrastructure.” Those respondents amounted to 7.5% of the total, suggesting 92.5% of the respondents are using the cloud in some form. Therefore, we can say with some confidence that the number of respondents whose organizations are using the cloud is somewhere between 89% and 93%.

These figures compare with 88% from our 2020 survey—a change that may well be insignificant. However, it’s worth asking what “insignificant” means: would we expect the number of “not using” responses to be near zero? On one hand, we’re surprised that there hasn’t been a larger change from year to year; on the other hand, when you’re already near 90%, gaining even a single percentage point is difficult. We can be somewhat (only somewhat) confident that there’s a genuine trend because we asked the same question three different ways and got similar results. An additional percentage point or two may be all we get, even if it doesn’t allow us to be as confident as we’d like.

Did the pandemic have an effect? It certainly didn’t slow cloud adoption. Cloud computing was an obvious solution when it became difficult or impossible to staff on-premises infrastructure. You could argue that the pandemic wasn’t much of an accelerant either, and it would be hard to disagree. Once again though, when you’re at 88%, gaining a percentage point (or two or three) is an achievement.

AWS, Azure, and Google Cloud: The big three and beyond

The big three in cloud computing are Amazon Web Services (AWS), Microsoft Azure, and Google Cloud, used by 62%, 48%, and 33% of the respondents, respectively. (Because many organizations use multiple providers, respondents were allowed to select more than one option.) Oracle Cloud (6%), IBM Cloud (5%), and Alibaba Cloud (2%) took fourth through sixth place. They have a long way to go to catch up to the leaders, although Oracle seems to have surpassed IBM. It’s also worth noting that, although Alibaba’s 2% seems weak, we expect Alibaba to be strongest in China, where we had very few respondents. Better visibility into Chinese industry might change the picture dramatically.

9% of the respondents selected “other” as their cloud provider. The leading “other” provider was Digital Ocean (1.4%), which almost edged out Alibaba. Salesforce, Rackspace, SAP, and VMware also appeared among the “others,” along with the Asian provider Tencent. Many of these “other” providers are software-as-a-service companies that don’t provide the kind of infrastructure services on which the big three have built their businesses. Finally, 11% of the respondents answered “not applicable.” These are presumably respondents whose organizations aren’t using the cloud.

Compared to last year, AWS appears to have lost some market share, going from 67% in 2020 to 62%. Microsoft Azure and Google Cloud remain unchanged.

Percent of respondents using each of the major cloud providers

Cloud usage by industry

One goal of our survey was to determine how cloud usage varies from industry to industry. We felt that the best way to answer that question was to go at it in reverse, by looking at the respondents who answered the question “What best describes why your organization doesn’t use cloud computing?” (which we’ll discuss in more detail later). Our results provided other ways to answer this question—for example, by looking at “not applicable” responses to questions about cloud providers. All approaches yielded substantially the same answers.

We found that retail & ecommerce, media & entertainment, finance & banking, and software stand out as the industry sectors with the highest cloud use. Only 3.1% of the respondents from the retail & ecommerce sector answered this question, indicating that cloud usage was close to universal (96.9%). 5.1% of the respondents in media & entertainment, 7.2% of the respondents in finance & banking, and 7.5% of the respondents in software answered this question, suggesting 94.9%, 92.8%, and 92.5% cloud usage, respectively. Most industries (including healthcare and higher education) clustered around 10% of organizations that aren’t using the cloud, or 90% cloud usage. The most cloud-averse industries were electronics & hardware (with 25% indicating that they don’t use the cloud) and government (16% not using). But consider: 25% of respondents indicating that they don’t use the cloud implies that 75% of the respondents do. While we saw variation from industry to industry, cloud users are a solid majority everywhere.

Can we get beyond the numbers to the “why”? Perhaps not without a much more detailed survey, but we can make some guesses. Although we had few respondents from the retail & ecommerce sector, it’s important to note that this industry is where the cloud took off: AWS began when Amazon started selling “excess capacity” in its data centers. Jeff Bezos paved the way for this with his famous “API mandate” memo, which required all software to be built as collections of services. In media & entertainment, Netflix has been very public about its cloud strategy. The company relies on the cloud for all of its scalable computing and storage needs, an approach initially undertaken as a way of avoiding on-premises infrastructure as a single point of failure.

But history often counts for little in tech. What’s more important is that retail & ecommerce is a sector subject to huge fluctuations in load. Black Friday is approaching as we publish this; need we say more? If your ecommerce site slows to a crawl under heavy load, you lose sales. The cloud is an ideal solution to that problem. No CIO wants to build an on-premises data center that can handle 100x changes in load. The same is true for Netflix, though perhaps not to the same degree: a new movie release almost certainly creates a spike in traffic, possibly a huge one. And in the past few years, movie studios, vendors like Amazon, and many others in the industry have realized that the future of movies lies in selling subscriptions to streaming services, not cinema tickets. Cloud technologies are ideal for streaming services.

Just about every software company, from startups to established vendors like Microsoft and Adobe, now offers “software as a service” (SaaS), an approach arguably pioneered by Salesforce. Whether or not subscription services are the future of software, most software companies are betting heavily on cloud offerings, and they’re building those offerings in the cloud.

Understanding why banks are moving to the cloud may be more difficult, but we think it comes down to focusing on your core competence. The finance & banking industry has historically been very conservative, with organizations going for decades without significant change to their business models or procedures. In the past decade, that stability has gone out the window. Financial service companies and banks are now offering online and mobile products, investment services, financial planning, and much more. The best way to service these new applications isn’t by building out legacy infrastructure designed to support legacy applications largely unchanged since the beginning of computing; it’s by moving to an infrastructure that can scale on demand and that can be quickly adapted to support new applications.

Cloud providers by industry

Our next step was to look at the cloud providers to determine what providers are used by each industry. Are some providers used more widely in certain industries than in others? When we looked at this question, we saw a familiar pattern: AWS is the most widely used, followed by Microsoft Azure and Google Cloud. AWS dominates media & entertainment (79%) and is the most commonly used provider in every sector except for consulting & professional services (58%, compared to Azure at 60%) and government (52%, compared to Azure at 59%).

In addition to government and consulting & professional services, Azure is widely used in finance & banking (55%). That shouldn’t be surprising given the historical prominence of Microsoft Office in this industry.

Google Cloud was third in every sector except for media & entertainment (35%), where it edged out Azure. It’s strongest in the consulting & professional services sector (41%) and the relatively small computers sector (40%) and weakest in government (16%), healthcare (25%), and finance & banking (29%).

Electronics & hardware had the greatest number of respondents who answered “not applicable” (28%). Although there were surprisingly few respondents from the retail & ecommerce sector, it had the fewest (3%) respondents who answered “not applicable.”

AWS’s, Microsoft Azure’s, and Google Cloud’s shares were closest to each other in the higher education sector (49%, 43%, and 39%, respectively).

Cloud provider usage by industry

Cloud usage by geography

We wondered whether the usage of different cloud providers varied by continent: are some cloud providers more popular on some continents than on others? By and large, the answer is no. AWS leads by a substantial margin on every continent, and Microsoft Azure and Google Cloud take second and third place, though their relative strengths vary. Google Cloud is significantly stronger in South America (49%) and Asia (40%) than on the other continents. Azure is strongest in Oceania (55%), Africa (51%), and Europe (50%).

Alibaba Cloud is a somewhat more common choice in Asia (5%) and Oceania (3%), but not enough to change the picture significantly. Remember, though, that we had few respondents from China, where we suspect that cloud adoption is significant and Alibaba is a strong contender.

Although the percentages are relatively small, it’s also worth noting that more respondents in Oceania are using “other” providers (13%), possibly because their relative geographic isolation makes local cloud providers more attractive, and that a large percentage of respondents in Europe answered “not applicable” (14%), indicating that cloud adoption may still be lagging somewhat.

Cloud vendor usage by continent (North America, Europe, Asia, South America, Oceania, and Africa)

While we’ll discuss multicloud in more detail later, it’s interesting that this diagram gives some hints about the extent of deployment on multiple clouds. Remember that respondents could and frequently did select multiple cloud providers, so the total percentage in any continent (always greater than 100%) is a very rough indicator of multiple cloud deployments. By that measure, Africa (totaling 142%) and Europe (158%) have the fewest multiple cloud deployments; Oceania (179%) has the most.

Public or not

We asked our respondents what cloud technologies they’re using. 67% (two-thirds) are using a public cloud provider, such as AWS, versus 61% in 2020. 45% are using a private cloud—private infrastructure (on-premises or perhaps hosted) that’s accessed using cloud APIs—which represents a 10-point increase over 2020 (35%). And 55% are using traditionally managed on-premises infrastructure, as opposed to 49% last year.

Respondents could select multiple answers, and many did. It’s not surprising that so many organizations appear to be using on-prem infrastructure; we’re actually surprised that the number isn’t higher, since in any cloud transformation, the remnants of traditional infrastructure are necessarily the last thing to disappear. And most companies aren’t that far along in their transformations. Moving to the cloud may be an important goal, and cloud resources are probably already providing critical infrastructure. But eliminating all (or even most) traditional infrastructure is a very heavy lift.

That’s not the whole story. 29% of the respondents said they’re using a hybrid cloud (a significant drop from last year’s 39%), and 23% are using multicloud (roughly the same year over year). A multicloud strategy builds systems that run across multiple cloud providers. Hybrid cloud goes even farther, incorporating private cloud infrastructure (on-premises or hosted) running cloud APIs. When done correctly, multiclouds and hybrid clouds can provide continuity in the face of provider outages, the ability to use “the best tool for the job” on different application workloads (for example, leveraging Google Cloud’s AI facilities), and easier regulatory compliance (because sensitive workloads can stay on private systems).

Therefore, respondents who selected “hybrid cloud” should also have selected “public cloud” and “private cloud” (or, possibly, “traditional infrastructure”). Indeed, that’s what we saw. Only 11% of the respondents who selected “hybrid cloud” didn’t select any other types—and we’d bet that’s because they assumed that “hybrid” implied the others. 27% of those who selected “hybrid” selected all five types, and the remainder selected some combination of the five options. The same was true for respondents who selected “multicloud”: only 4% selected “multicloud” by itself. (“Multicloud” and “hybrid cloud” were frequently selected together.)

Cloud technology usage

We’re puzzled by the difference between 2020 and 2021. An increase in the use of public clouds, private clouds, and even traditional infrastructure makes sense: users have clearly become more comfortable mixing and matching infrastructure to suit their needs. We don’t expect the use of traditional infrastructure to disappear. But why did usage of hybrid clouds drop? We don’t have a good answer, except to note that many respondents (indeed, a third of the total) who didn’t select either “multicloud” or “hybrid cloud” still selected multiple infrastructure types. A combination of public cloud and traditional infrastructure was most common, followed by public cloud, private cloud, and traditional infrastructure. These mixtures indicate that many respondents are clearly using some form of multicloud or hybrid cloud, even if they aren’t including that in their responses.

We can’t ignore the slip in the percentage of respondents answering “hybrid cloud.” That may indicate some skepticism about this most ambitious cloud architecture. It could even be an artifact of the pandemic. We’ve already said that the pandemic gave many companies a good reason to move on-premises infrastructure to the cloud. But while the pandemic may have been a good time to start cloud transformations, it was arguably a poor time to start very ambitious projects. Can we imagine CTOs saying, “Yes, we’re moving to the cloud, but we’re keeping it as simple as possible”? Definitely.

We also looked at what types of cloud technologies were attractive to different industries. Public clouds are most heavily used in retail & ecommerce (79%), media & entertainment (73%), and software (72%). Hybrid clouds are strongest in consulting & professional services (38%), possibly because consultants often play a role in integrating different cloud providers into a seamless whole. Private clouds are strongest in telecommunications (64%), which was the only sector in which private clouds led public clouds (60%).

Traditionally managed on-premises infrastructure is most widely used in government (72%). Other industries where the number of respondents using traditional infrastructure equaled or exceeded the number reporting any form of cloud infrastructure included higher education (61%), healthcare (61%), telecommunications (67%), computers (65%), and electronics & hardware (58%).

Cloud technology usage by industry

When asked about their organization’s cloud strategy, almost half of the respondents (47%) answered “cloud first,” meaning that wherever possible, new initiatives consider the cloud as the first option. Only 5% selected “cloud repatriation,” or bringing services that were previously moved to the cloud back in-house. 10% indicated a multicloud strategy, where they work with multiple public cloud providers; and 9% indicated that their strategy is to use software-as-a-service cloud providers where possible (e.g., specific applications from companies like Salesforce and SAP), thus minimizing the need to develop their own in-house cloud expertise. Since “Buy before build; only build software related to your core competence” is an important principle in any digital transformation, we expected to see a greater investment in software-as-a-service products. Perhaps this number really means that almost any company will need to build software around its core value proposition.

Cloud migration strategies

Our respondents approach cloud migration aggressively. Almost half (48%) said they plan to migrate 50% or more of their applications to the cloud in the coming year; the largest group (20%) said they plan to migrate 100% of their applications. (We wonder if, for many of these respondents, a migration was already in progress.) 16% said they plan to migrate 25% of their applications. 36% answered “not applicable,” which may mean that they aren’t using the cloud (though this would indicate much lower cloud usage than other questions do) or that the respondent’s organization has already moved its applications to the cloud. It’s probably a combination of both.

When asked specifically about cloud native development (building software to run in a scalable cloud environment, whether that cloud is public, private, or hybrid), there was an even split between those who have no plan to go cloud native, respondents representing businesses that are already 100% cloud native, and respondents who thought they would be cloud native at some point in the future. Each group was (very) roughly a third of the total. Looking in more detail at respondents who are in the process of going cloud native, only 6% expect to be cloud native within a year. The largest group (20%) said they’d be cloud native within three or more years, indicating a general direction, if not a specific goal.

Going cloud native

Does the 67% who are planning to be or are already cloud native conflict with the 47% who said that they’re pursuing a cloud first strategy? It’s jarring—“cloud native” is, if anything, a stronger statement than “cloud first.” Presumably anyone who works for an organization that’s already cloud native is also pursuing a cloud first strategy. Some of the gap disappears if we include respondents executing a multicloud strategy in the “cloud first” group, which brings the “cloud first” total to 57%. After all, “cloud native” as defined by Wikipedia explicitly includes hybrid clouds.

Perhaps more to the point: there’s a lot of latitude in how respondents might interpret slogans and buzzwords like “cloud native” and “cloud first.” Someone who says that their organization will be “cloud native” at some point in the future (whether it takes one year, two years, or three or more years) isn’t saying that there aren’t significant noncloud projects in progress, and three or more years hardly sets an ambitious goal. But regardless of how respondents may have understood these terms, it’s clear that a substantial majority are moving in a direction that places all of their workloads in the cloud.

Cost and other issues

Survey respondents consistently reported that cost is a concern. When asked about the most important initiatives in their organizations pertaining to public cloud adoption, 30% of all respondents said “managing cost.” Other important cloud projects include performance optimization (13%), modernizing applications (19%), automating compliance and governance (10%), and cloud migration itself (11%). Only 6% listed migrating to a multicloud strategy as an issue—surprising given the large number who said they’re pursuing hybrid cloud or multicloud strategies.

These results were similar across all industries. In almost every sector, managing cost was perceived as the most important cloud initiative. Government and finance & banking were outliers; in these sectors, modernizing applications was a greater concern than cost management. (23% of the respondents in the government sector listed modernization as the most important initiative, as did 21% of the respondents from finance & banking.)

Most important initiatives that organizations will be tackling

Among respondents who aren’t using cloud computing, 21% said that regulatory requirements compel them to keep data on-premises; 19% said that cost is the most important factor; and 19% were concerned with the risk of migration. Relatively few were concerned about the availability of talent (6%, in sharp contrast to our 2021 Data/AI Salary Survey results), and 5% said vendor lock-in is a concern. Again, this aligns well with our results from 2020: keeping data on-premises was the most common reason for not using cloud computing, and cost was the second most common, followed by migration risk.

Reasons organizations are not using cloud computing

Why is cost such a critical factor? It’s easy to get into cloud computing on a small scale: building some experimental apps in the cloud because you can rent time using a company credit card rather than going through IT for more resources. If successful, those applications become real software investments that you need to support. They start to require more resources, and as the scale grows, you find that your cloud provider is getting the “economies of scale,” not you. Cloud providers certainly know how to set pricing so that it’s easy to move in but difficult to move out.

So, yes, cost needs to be managed. And one way to manage cloud cost is to stay away. You’re not locked into a vendor if you don’t have a vendor. But that’s a simplistic answer. Good cost management needs to account for the true benefits of moving to the cloud, which usually don’t involve cost. By analogy, assembly lines didn’t minimize the cost of building a factory; they made factories more effective. The cloud’s ability to scale quickly to handle sudden changes in workload is worth a lot. Do your applications suddenly become sluggish if there’s a sudden spike in load, and does that cause you to lose sales? In 2021, “Please be patient; our systems are experiencing heavy load” tells customers to go elsewhere. Improved uptime is also worth a lot; cloud providers have multiple data centers and backup power that most businesses can’t afford. (At O’Reilly, we learned this firsthand during the California fires of 2019, which disabled our on-premises infrastructure. We’re now 100% in the cloud, and we’re sure other companies learned the same lesson.)

Regulatory requirements are a big concern for respondents who aren’t using cloud computing (21%). When we look at respondents as a whole, though, we see something different. In most industries, roughly 10% of the respondents listed regulatory compliance as the most important initiative. The most notable outliers were finance & banking (15%), government (19%), and healthcare (19%)—but compliance still wasn’t the biggest concern for these industries. Respondents from the higher education sector reported little concern about compliance (4.8%). Other industries that were relatively unconcerned about compliance included electronics & hardware and media & entertainment (both 3.8%). Although we’re surprised by the responses from higher education, on the whole, these observations make sense: compliance is a big issue for industries that are heavily regulated and less of an issue for industries that aren’t. However, it’s also important to observe that concern about compliance isn’t preventing heavily regulated industries from moving to the cloud. Again, regulatory compliance is a concern—but that concern is trumped by the need to provide new kinds of applications.

Skills

Although only 6% of the respondents who aren’t using cloud computing said that skill availability was an issue, we’re skeptical about that—if you’re not moving to the cloud, you don’t need cloud skills. We got a different answer when we asked all respondents what skills were needed to develop their cloud infrastructure. (For this question, respondents were allowed to choose multiple options.) The most common response was “cloud-based security” (59%; over half of the respondents), with “general cloud knowledge” second (54%). That’s a good sign. Organizations are finally waking up to the fact that security is essential, not something that’s added on if there’s time.

Skills needed to develop a cloud infrastructure

Perhaps the biggest thing to learn from this question, though, is that every skill (except “other”) was selected by over 35% of the respondents, and most were selected by roughly 45%. Containers (46%), Kubernetes (44%), microservices (45%), compliance (38%), monitoring (51%), observability (40%), scaling (41%), and performance (44%) are all in this territory. Our respondents appear to be saying that they need everything. All the skills. There’s definitely a shortage in cloud expertise. In our recently published 2021 Data/AI Salary Survey report, we noted that cloud certifications were most associated with salary increases. That says a lot: there’s demand, and employers are willing to pay for talent.

Portability between clouds

Our final question looked forward to the next generation of cloud computing. We asked about the barriers to moving workloads freely between cloud deployment platforms: what it takes to move applications seamlessly from one cloud to another, to a private cloud, and even to traditional infrastructure. That’s really the goal of a hybrid cloud.

Application portability and security were the biggest concerns (both 24%). The need for portability is obvious. But what may lie behind concern over portability is the string of development platforms that have promised application portability, going back at least to the Object Management Group’s CORBA in 1991 (and possibly much earlier). Containers and container orchestration are themselves “write once, run anywhere” technologies. WebAssembly (Wasm) is the current trendy attempt to find this holy grail; we’ll find out in the coming years whether it suffices.

Security on one platform is hard enough, and writing software that’s secure across multiple execution environments is much more difficult. With the increasing number of high-profile attacks against large businesses, executives have a right to be concerned. At the same time, every security expert we’ve talked to has emphasized that the most important thing any organization can do is to pay attention to the basics: authentication, authorization, software update, backups, and other aspects of security hygiene. In the cloud, the tools and techniques used to implement those basics are different and arguably more complex, but the basics remain the same.

Other concerns all clustered around 10%: the most significant include data portability (12%), important and often overlooked; the cost of moving data out of one cloud provider into another (9%), a concern we saw elsewhere in this study; managing compliance and, more generally, managing workloads at scale across multiple platforms (both 8%); and visibility into application performance (7%).

Barriers to moving applications between clouds

Until next year

What did we learn? Cloud adoption continues, and it doesn’t appear to have been affected by the pandemic. Roughly 90% of our respondents work for organizations that are moving applications to the cloud. This percentage is only slightly larger than last year (88%) and may not even be significant. (But keep in mind that when you’re at 90%, any further gains come with great difficulty. In practice, 90% is about as close to “everybody” as you can get.)

We also believe that we’re only at the beginning of cloud adoption. Our audience is technically sophisticated, and they’re more likely to be cloud adopters. A large majority of the respondents are in the process of moving applications to the cloud, indicating that “cloud” is a work in progress. It’s clearly a situation in which the more you do, the more you see that can be done. The more workloads you move to the cloud, the more you realize that other workloads can move. More important, the more comfortable you are with the cloud, the more innovative you can be in pushing your digital transformation even further.

Concerns about compliance remain significant. Not surprisingly, those concerns are higher among respondents who aren’t using the cloud than among those who are. That’s natural—but looking at the rapid pace of cloud adoption in heavily regulated industries like finance & banking makes us think that “compliance” is more of an excuse for noncloud users than a real concern. Yes, compliance is an issue, but it’s an issue that many organizations are solving.

Managing costs is obviously an important concern, and unlike compliance, it’s a concern ranked more highly by cloud users than nonusers. That’s both normal and healthy. The common perception that cloud computing is inexpensive isn’t reality. Being able to allocate a few servers or high-powered compute engines with a credit card is certainly an inexpensive way to start a project, but those savings evaporate at enterprise scale. The cloud provider will reap the economies of scale. For a user, the cloud’s real advantages aren’t in cost savings but in flexibility, reliability, and scaling.

Finally, cloud skills are in demand across the board. General skills, specific expertise in areas like security, microservices, containers, and orchestration—all are needed. Whether or not there’s a shortage of cloud expertise in the job market, this is an excellent time to pursue training opportunities. Many organizations are dealing with the question of whether to hire or train their staff for new capabilities. When pursuing a cloud transformation, it makes eminent sense to rely on developers who already understand your business and your applications. Hiring new talent is always important, but giving your current staff new skills may be more productive in the long run. If your business is going to be a cloud business, then your software developers need to become cloud developers.


Footnote

  1. A brief note about precision. We’ve rounded percentages to the nearest 1%, except in some cases where the percentage is small (under 10%). With nearly 3,000 respondents, 0.1% only represents 3 users, and we’d argue that drawing conclusions based on a difference of a percentage point or two is misusing the data.

Radar trends to watch: December 2021 [Radar]

The last month had a few surprises. Three items about quantum computing—all of which appeared on the same day. You’d think they were coordinating with each other. And of course, everybody wants to build their own version of the metaverse. There are several takes on having avatar-filled meetings in virtual spaces. Unfortunately, this solves the wrong problem. The problem that needs solving isn’t making meetings better, it’s making them unnecessary.

AI, ML, and Robotics

  • A self-driving library?  This public library robot in Korea carries 100 books and drives around on sidewalks; people with library cards can check the books out.
  • Increasingly widespread skepticism over Artificial General Intelligence may be a harbinger of another AI Winter–or at least an AGI winter, since current AI techniques have found many homes in industry. We don’t have to worry about paperclips yet.
  • The US Department of Defense has issued ethical guidelines for the use of artificial intelligence by its contractors.
  • Facebook has built an AI model that can translate between 100 human languages in any direction without relying on data from English.  That model is now open source.
  • Israel’s Defense Force produced an AI-based (“deepfake”) video that animated photos of soldiers who died in the 1948 Arab-Israeli war.  What does the ability to modify and fake historical documents mean for our ability to imagine the past and understand history?
  • Self-supervised learning with models that are heavily pruned can be used to build speech recognition systems for languages with relatively small numbers of speakers.
  • A framework to contest and justify algorithmic decisions is an important part of AI accountability. It’s not possible to redress harms if a decision cannot be contested, and it’s not possible to contest decisions if a system is incapable of offering a justification.
  • Facebook will stop using facial recognition technology and is deleting its face database, although it is keeping the model trained on that database. Out of the other side of their mouth, they have said this announcement doesn’t apply to Meta, which will use this model to produce VR products.
  • An AI system to give ethical advice gives unethical advice. What’s concerning isn’t the bad advice so much as the naiveté of the research project. Without a huge step forward in natural language understanding, and in the ability to reason about the history of human thought, why would anyone expect an AI system to do more than parrot back bad ethical advice that it finds in bulk on the web? Stochastic parrots indeed.
  • If language models are going to be more than stochastic parrots, they need ways to represent knowledge. Are knowledge graphs the key? The question of knowledge representation begs the question of what knowledge is, and how clever fakes along with recalcitrant ideologies both challenge the meaning of “knowledge.”
  • Unimaginable instruments may not exist in the physical world, but can be created (and played) with AI.  These instruments sense and understand music, and attempt to respond to what the musicians are doing and assist them. (Too many of these instruments sound like they came from the soundtrack of bad sci-fi movies, but maybe that’s just me.)

Programming

  • The Deadlock Empire is a tutorial disguised as a game in which participants solve concurrent programming challenges to avoid deadlocks. This is an important new approach to online learning.
  • Because Git by nature tracks what changes were made, and who made those changes, GitOps may have a significant and underappreciated role in compliance.
  • ARM has joined the foundation promoting the Rust programming language, along with Toyota and 14 other new industrial members.
  • Is cloud repatriation (moving from the cloud back to on-premises datacenters) happening?  On-premises infrastructure will never disappear; there will always be some data that’s too difficult or important to move. And there are no doubt cloud projects that don’t deliver, and move back on-prem. But we don’t see a big shift, nor do we see “edge” as a new kind of “on-prem.”

Web

  • Bringing back the browser wars:  In Windows 11, Microsoft has made it difficult to use a browser other than their Edge, and requires the Edge browser for certain functions that use the proprietary microsoft-edge:// protocol. They have blocked workarounds that allow other browsers to use this protocol.
  • Hydrogen is a new React-based web framework developed by Shopify, optimized for e-commerce.  It’s now in developer preview.
  • A bipartisan proposal in the US House of Representatives would require social media companies like Google and Facebook to offer users results that aren’t filtered by “algorithms.”

Virtual and Augmented Reality

  • Inhabitants of the Metaverse will face the problem of how to present themselves online: how to design appropriate avatars. This can lead to a new level of anxiety over physical appearance and presentation, particularly if the options presented are limited.
  • Niantic is also building a metaverse, based on its Lightship augmented reality development kit, which it has just opened to the public. Their take on the metaverse is that it’s bad for humans to stay indoors, cocooned in virtual worlds.
  • Microsoft will have its own Teams-based metaverse. It’s built on avatars, not presence, and is aimed at improving the experience of working from home.

Quantum Computing

  • A startup claims to have built a 256-Qubit quantum processor; they also have a roadmap to get to 1,000 Qubits in two years. They claim that their approach offers greater fidelity (accuracy) than traditional approaches.
  • IBM has built a 127-Qubit quantum processor, with a roadmap to get to 1,000 physical Qubits in two years.
  • IBM has claimed (without providing evidence) that they have achieved quantum supremacy by solving a problem that is unsolvable by classical computers. At this point, the reaction has been “interesting, but show us the data.”

Security and Privacy

  • Gmail adds confidential mode for encrypted email.  It’s not fully end-to-end encrypted (among other things, Google performs spam detection), but it’s by far the easiest approach to securing email out there.
  • Ransomware defense tips for small businesses from the US Federal Trade Commission: The first step is offline, encrypted backups.  The FTC also has a guide about how to respond to a ransomware attack.
  • Securing your digital life is an excellent four-part series on personal security. (There may be more coming.)
  • A study (apparently in the UK) has reported that a third of the people working from home are subject to surveillance by their employer.
  • The international cyber surveillance industry is booming, and is becoming a serious international security issue.
  • Deception as a tool in defense against attacks: Traditional honeypots are old school.  Software development teams can build observable distributed systems that mimic real software, so that an attack can be safely monitored in detail, and developers can learn about vulnerabilities and techniques.
  • Attackers are stealing sensitive encrypted data and sitting on it, in hopes that when quantum computers are more widely available they can crack the encryption. That’s long term planning. This kind of hacking may be the purview of foreign states.
  • Most discussions of security focus on software. However, software is only part of the problem. Mitre has released a list of important hardware vulnerabilities.  Many of these arise from software embedded in the hardware–but regardless, programmers largely assume that the hardware on which their code runs isn’t vulnerable to attack.
  • Ransomware is targeting companies during mergers and acquisitions. It makes sense; that’s a period in which access to data is important and very time-sensitive.
  • Prossimo is a project of the Internet Security Research Group (ISRG) for discovering and fixing memory safety issues in Internet infrastructure code, and (more generally) to change the way programmers think about memory safety.
  • The Trojan Source vulnerability uses Unicode’s ability to handle bi-directional text to hide malware directly in the source code, where it is invisible. The code literally does not appear to say what it means.

Cryptocurrency

  • The ConstitutionDAO is a decentralized autonomous organization that attempted to buy one of the original copies of the US Constitution. It’s a fascinating attempt to create an organization that exists on the blockchain but owns a physical object. What’s most fascinating is the many layers of traditional trust that are required to make this decentralized trustless organization work.
  • NFTs could be about a lot more than “ownership” of a URL. Because they are programmable, they can include behavior, and have the potential to create new kinds of markets.

Internet of Things

  • A server problem at Tesla made it impossible for Tesla owners to start their car with their app. Why hasn’t Tesla learned from the problems other IoT vendors have experienced with smart locks and other devices? Smart devices that don’t work are really dumb.
  • Operating systems for the Internet of Things:  The Eclipse foundation has launched Oniro, an open source multikernel operating system for small devices, hoping that Oniro can unify a fragmented ecosystem. Unification will benefit security and interoperability between devices.
  • The US National Institute of Standards and Technology’s “lightweight cryptography” project attempts to find cryptographic algorithms that are appropriate for small devices. Most current cryptography is computationally very demanding, requiring (at least) a laptop, and isn’t appropriate for an embedded system.

Low-Code and the Democratization of Programming [Radar]

In the past decade, the growth in low-code and no-code solutions—promising that anyone can create simple computer programs using templates—has become a multi-billion dollar industry that touches everything from data and business analytics to application building and automation. As more companies look to integrate low-code and no-code solutions into their digital transformation plan, the question emerges again and again: what will happen to programming?

Programmers know their jobs won’t disappear with a broadscale low-code takeover (even low-code is built on code), but undeniably their roles as programmers will shift as more companies adopt low-code solutions. This report is for programmers and software development teams looking to navigate that shift and understand how low-code and no-code solutions will shape their approach to code and coding. It will be fundamental for anyone working in software development—and, indeed, anyone working in any business that is poised to become a digital business—to understand what low-code means, how it will transform their roles, what kinds of issues it creates, why it won’t work for everything, and what new kinds of programmers and programming will emerge as a result.

Everything Is Low-Code

Low-code: what does it even mean? “Low-code” sounds simple: less is more, right? But we’re not talking about modern architecture; we’re talking about telling a computer how to achieve some result. In that context, low-code quickly becomes a complex topic.

One way of looking at low-code starts with the spreadsheet, which has a pre-history that goes back to the 1960s—and, if we consider paper, even earlier. It’s a different, non-procedural, non-algorithmic approach to doing computation that has been wildly successful: is there anyone in finance who can’t use Excel? Excel has become table stakes. And spreadsheets have enabled a whole generation of businesspeople to use computers effectively—most of whom have never used any other programming language, and wouldn’t have wanted to learn a more “formal” programming language. So we could think about low-code as tools similar to Excel, tools that enable people to use computers effectively without learning a formal programming language.

Another way of looking at low-code is to take an even bigger step back, and look at the history of programming from the start. Python is low-code relative to C++; C and FORTRAN are low-code relative to assembler; assembler is low-code relative to machine language and toggling switches to insert binary instructions directly into the computer’s memory. In this sense, the history of programming is the history of low-code. It’s a history of democratization and reducing barriers to entry. (Although, in an ironic and unfortunate twist, many of the people who spent their careers plugging in patch cords, toggling in binary, and doing math on mechanical calculators were women, who were later forced out of the industry as those jobs became “professional.” Democratization is relative.) It may be surprising to say that Python is a low-code language, but it takes less work to accomplish something in Python than in C; rather than building everything from scratch, you’re relying on millions of lines of code in the Python runtime environment and its libraries.
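
To make that concrete, here is a small, purely illustrative sketch (the filename is a placeholder): counting the ten most common words in a file takes a handful of lines of Python because the tokenizing, hashing, and sorting all live in the runtime and standard library, while the equivalent C program would have to build or pull in that machinery itself.

    # Count the ten most common words in a text file.
    # A few lines of Python; the hard work happens in the runtime and standard library.
    # ("report.txt" is a placeholder filename for this sketch.)
    from collections import Counter

    with open("report.txt") as f:
        words = f.read().lower().split()

    for word, count in Counter(words).most_common(10):
        print(f"{count:6d}  {word}")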

In taking this bigger-picture, language-based approach to understanding low-code, we also have to take into account what the low-code language is being used for. Languages like Java and C++ are intended for large projects involving collaboration between teams of programmers. These are projects that can take years to develop, and run to millions of lines of code. A language like bash or Perl is designed for short programs that connect other utilities; bash and Perl scripts typically have a single author, and are frequently only a few lines long. (Perl is legendary for inscrutable one-liners.) Python is in the middle. It’s not great for large programs (though it has certainly been used for them); its sweet spot is programs that are a few hundred lines long. That position between big code and minimal code probably has a lot to do with its success. A successor to Python might require less code (and be a “lower code” language, if that’s meaningful); it would almost certainly have to do something better. For example, R (a domain-specific language for stats) may be a better language for doing heavy-duty statistics, and we’ve been told many times that it’s easier to learn if you think like a statistician. But that’s where the trade-off becomes apparent. Although R has a web framework that allows you to build data-driven dashboards, you wouldn’t use R to build an e-commerce site or an automated customer service agent; those are tasks for which Python is well suited.

Is it completely out of bounds to say that Python is a low-code language? Perhaps; but it certainly requires much less coding than the languages of the 1960s and ’70s. Like Excel, though not as successfully, Python has made it possible for people to work with computers who would never have learned C or C++. (The same claim could probably be made for BASIC, and certainly for Visual Basic.)

But this makes it possible for us to talk about an even more outlandish meaning of low-code. Configuration files for large computational systems, such as Kubernetes, can be extremely complex. But configuring a tool is almost always simpler than writing the tool yourself. Kelsey Hightower said that Kubernetes is the “sum of all the bash scripts and best practices that most system administrators would cobble together over time”; it’s just that many years of experience have taught us the limitations of endless scripting. Replacing a huge and tangled web of scripts with a few configuration files certainly sounds like low-code. (You could object that Kubernetes’ configuration language isn’t Turing complete, so it’s not a programming language. Be that way.) It enables operations staff who couldn’t write Kubernetes from scratch, regardless of the language, to create configurations that manage very complicated distributed systems in production. What’s the ratio—a few hundred lines of Kubernetes configuration, compared to a million lines of Go, the language Kubernetes was written in? Is that low-code? Configuration languages are rarely simple, but they’re always simpler than writing the program you’re configuring.
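
As a rough illustration of what “a few configuration files” buys you, here is the skeleton of a Kubernetes Deployment written out as a Python dictionary (the application name and container image are hypothetical placeholders; operators would normally write the same structure as YAML). A dozen declarative lines stand in for the scripts that would otherwise launch, watch, and restart three copies of a service.

    # The shape of a Kubernetes Deployment, expressed as a Python dict for clarity.
    # The name and container image below are hypothetical placeholders.
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "web"},
        "spec": {
            "replicas": 3,  # "keep three copies running" -- no restart scripts required
            "selector": {"matchLabels": {"app": "web"}},
            "template": {
                "metadata": {"labels": {"app": "web"}},
                "spec": {
                    "containers": [
                        {
                            "name": "web",
                            "image": "example/web:1.0",
                            "ports": [{"containerPort": 8080}],
                        }
                    ]
                },
            },
        },
    }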

As examples go, Kubernetes isn’t all that unusual. It’s an example of a “domain-specific language” (DSL) constructed to solve a specific kind of problem. DSLs enable someone to get a task done without having to describe the whole process from scratch, in immense detail. If you look around, there’s no shortage of domain-specific languages. Ruby on Rails was originally described as a DSL. COBOL was a DSL before anyone really knew what a DSL was. And so are many mainstays of Unix history: awk, sed, and even the Unix shell (which is much simpler than using old IBM JCLs to run a program). They all make certain programming tasks simpler by relying on a lot of code that’s hidden in libraries, runtime environments, and even other programming languages. And they all sacrifice generality for ease of use in solving a specific kind of problem.

So, now that we’ve broadened the meaning of low-code to include just about everything, do we give up? For the purposes of this report, we’re probably best off looking at the narrowest and most likely implementation of low-code technology and limiting ourselves to the first, Excel-like meaning of “low-code”—but remembering that the history of programming is the history of enabling people to do more with less, enabling people to work with computers without requiring as much formal education, adding layer upon layer of abstraction so that humans don’t need to understand the 0s and the 1s. So Python is low-code. Kubernetes is low-code. And their successors will inevitably be even lower-code; a lower-code version of Kubernetes might well be built on top of the Kubernetes API. Mirantis has taken a step in that direction by building an Integrated Development Environment (IDE) for Kubernetes. Can we imagine a spreadsheet-like (or even graphical) interface to Kubernetes configuration? We certainly can, and we’re fine with putting Python to the side. We’re also fine with putting Kubernetes aside, as long as we remember that DSLs are an important part of the low-code picture: in Paul Ford’s words, tools to help users do whatever “makes the computer go.”

Excel (And Why It Works)

Excel deservedly comes up in any discussion of low-code programming. So it’s worth looking at what it does (and let’s willfully ignore Excel’s immediate ancestors, VisiCalc and Lotus). Why has Excel succeeded?

One important difference between spreadsheets and traditional programming languages is so obvious that it’s easily overlooked. Spreadsheets are “written” on a two-dimensional grid (Figure 1). Every other programming language in common use is a list of statements: a list of instructions that are executed more or less sequentially.

Figure 1. A Microsoft Excel grid (source: Python for Excel)

What’s a 2D grid useful for? Formatting, for one thing. It’s great for making tables. Many Excel files do that—and no more. There are no formulas, no equations, just text (including numbers) arranged into a grid and aligned properly. By itself, that is tremendously enabling.

Add the simplest of equations, and built-in understanding of numeric datatypes (including the all-important financial datatypes), and you have a powerful tool for building very simple applications: for example, a spreadsheet that sums a bunch of items and computes sales tax to do simple invoices. A spreadsheet that computes loan payments. A spreadsheet that estimates the profit or loss (P&L) on a project.

All of these could be written in Python, and we could argue that most of them could be written in Python with less code. However, in the real world, that’s not how they’re written. Formatting is a huge value, in and of itself. (Have you ever tried to make output columns line up in a “real” programming language? In most programming languages, numbers and texts are formatted using an arcane and non-intuitive syntax. It’s not pretty.) The ability to think without loops and a minimal amount of programming logic (Excel has a primitive IF statement) is important. Being able to structure the problem in two or three dimensions (you get a third dimension if you use multiple sheets) is useful, but most often, all you need to do is SUM a column.
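
For comparison, here is roughly what the invoice example might look like as a short Python script (the line items and the 8% tax rate are made-up values for this sketch). It is arguably less code than a spreadsheet, but notice how much of it is format strings: the layout that Excel gives you for free has to be reconstructed by hand.

    # A minimal invoice: sum the line items, add sales tax, print a table.
    # The items and the 8% tax rate are made-up example values.
    items = [("widget", 3, 4.50), ("gadget", 1, 19.99), ("gizmo", 2, 7.25)]
    TAX_RATE = 0.08

    subtotal = sum(qty * price for _, qty, price in items)
    tax = subtotal * TAX_RATE

    for name, qty, price in items:
        print(f"{name:<10}{qty:>4} x {price:>8.2f} = {qty * price:>9.2f}")
    print(f"{'subtotal':<28}{subtotal:>9.2f}")
    print(f"{'tax (8%)':<28}{tax:>9.2f}")
    print(f"{'total':<28}{subtotal + tax:>9.2f}")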

If you do need a complete programming language, there’s always been Visual Basic—not part of Excel strictly speaking, but that distinction really isn’t meaningful. With the recent addition of LAMBDA functions, Excel is now a complete programming language in its own right. And Microsoft recently released Power Fx as an Excel-based low-code programming language; essentially, it’s Excel equations with something that looks like a web application replacing the 2D spreadsheet.

Making Excel a 2D language accomplished two things: it gave users the ability to format simple tables, which they really cared about; and it enabled them to think in columns and rows. That’s not sophisticated, but it’s very, very useful. Excel gave a new group of people the ability to use computers effectively. It’s been too long since we’ve used the phrase “become creative,” but that’s exactly what Excel did: it helped more people to become creative. It created a new generation of “citizen programmers” who never saw themselves as programmers—just more effective users.

That’s what we should expect of a low-code language. It isn’t about the amount of code. It’s about extending the ability to create to more people by changing paradigms (1D to 2D), eliminating hard parts (like formatting), and limiting what can be done to what most users need to do. This is democratizing.

UML

UML (Unified Modeling Language) was a visual language for describing the design of object oriented systems. UML was often misused by programmers who thought that UML diagrams somehow validated a design, but it gave us something that we didn’t have, and arguably needed: a common language for scribbling software architectures on blackboards and whiteboards. The architects who design buildings have a very detailed visual language for blueprints: one kind of line means a concrete wall, another wood, another wallboard, and so on. Programmers wanted to design software with a visual vocabulary that was equally rich.

It’s not surprising that vendors built products to compile UML diagrams into scaffolds of code in various programming languages. Some went further to add an “action language” that turned UML into a complete programming language in its own right. As a visual language, UML required different kinds of tools: diagram editors, rather than text editors like Emacs or vi (or Visual Studio). In modern software development processes, you’d also need the ability to check the UML diagrams themselves (not the generated code) into some kind of source management system; i.e., the important artifact is the diagram, not something generated from the diagram. But UML proved to be too complex and heavyweight. It tried to be everything to everybody: both a standard notation for high-level design and a visual tool for building software. It’s still used, though it has fallen out of favor.

Did UML give anyone a new way of thinking about programming? We’re not convinced that it did, since programmers were already good at making diagrams on whiteboards. UML was of, by, and for engineers, from the start. It didn’t have any role in democratization. It reflected a desire to standardize notations for high-level design, rather than rethink it. Excel and other spreadsheets enabled more people to be creative with computers; UML didn’t.

LabVIEW

LabVIEW is a commercial system that’s widely used in industry—primarily in research & development—for data collection and automation. The high-school FIRST Robotics program depends heavily on it. The visual language that LabVIEW is built on is called G, and doesn’t have a textual representation. The dominant metaphor for G is a control panel or dashboard (or possibly an entire laboratory). Inputs are called “controls”; outputs are called “indicators.” Functions are “virtual instruments,” and are connected to each other by “wires.” G is a dataflow language, which means that functions run as soon as all their inputs are available; it is inherently parallel.

It’s easy to see how a non-programmer could create software with LabVIEW doing nothing more than connecting together virtual instruments, all of which come from a library. In that sense, it’s democratizing: it lets non-programmers create software visually, thinking only about where the data comes from and where it needs to go. And it lets hardware developers build abstraction layers on top of FPGAs and other low-level hardware that would otherwise have to be programmed in languages like Verilog or VHDL. At the same time, it is easy to underestimate the technical sophistication required to get a complex system working with LabVIEW. It is visual, but it isn’t necessarily simple. Just as in Fortran or Python, it’s possible to build complex libraries of functions (“virtual instruments”) to encapsulate standard tasks. And the fact that LabVIEW is visual doesn’t eliminate the need to understand, in depth, the task you’re trying to automate, and the hardware on which you’re automating it.

As a purely visual language, LabVIEW doesn’t play well with modern tools for source control, automated testing, and deployment. Still, it’s an important (and commercially successful) step away from the traditional programming paradigm. You won’t see lines of code anywhere, just wiring diagrams (Figure 2). Like Excel, LabVIEW provides a different way of thinking about programming. It’s still code, but it’s a different kind of code, code that looks more like circuit diagrams than punch cards.

Figure 2. An example of a LabVIEW schematic diagram (source: JKI)

Copilot

There has been a lot of research on using AI to generate code from human descriptions. GPT-3 has made that work more widely visible, but it’s been around for a while, and it’s ongoing. We’ve written about using AI as a partner in pair programming. While we were writing this report, Microsoft, OpenAI, and GitHub announced the first fruit of this research: Copilot, an AI tool that was trained on all the public code in GitHub’s codebase. Copilot makes suggestions while you write code, generating function bodies based on descriptive comments (Figure 3). Copilot turns programming on its head: rather than writing the code first, and adding comments as an afterthought, start by thinking carefully about the problem you want to solve and describing what the components need to do. (This inversion has some similarities to test-driven and behavior-driven development.)
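
To make the comment-first workflow concrete, here is a hypothetical illustration (our own sketch, not captured Copilot output): the programmer writes the descriptive comment and the signature, and an assistant like Copilot proposes a body, which the programmer then reviews.

    # Hypothetical illustration of the comment-first style; not actual Copilot output.
    # The programmer writes the docstring; the assistant suggests an implementation.

    def median(values):
        """Return the median of a non-empty list of numbers."""
        ordered = sorted(values)
        mid = len(ordered) // 2
        if len(ordered) % 2:  # odd count: take the middle element
            return ordered[mid]
        return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the two middle elements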

Still, this approach begs the question: how much work is required to find a description that generates the right code? Could technology like this be used to build a real-world project, and if so, would that help to democratize programming? It’s a fair question. Programming languages are precise and unambiguous, while human languages are by nature imprecise and ambiguous. Will compiling human language into code require a significant body of rules to make it, essentially, a programming language in its own right? Possibly. But on the other hand, Copilot takes on the burden of remembering syntax details, getting function names right, and many other tasks that are fundamentally just memory exercises.

Figure 3. GitHub’s Copilot in action (source: Copilot)

Salvatore Sanfilippo (@antirez) touched on this in a Twitter thread, saying “Every task Copilot can do for you is a task that should NOT be part of modern programming.” Copilot doesn’t just free you from remembering syntax details, what functions are stashed in a library you rarely use, or how to implement some algorithm that you barely remember. It eliminates the boring drudgery of much of programming—and, let’s admit it, there’s a lot of that. It frees you to be more creative, letting you think more carefully about that task you’re doing, and how best to perform it. That’s liberating—and it extends programming to those who aren’t good at rote memory, but who are experts (“subject matter experts”) in solving particular problems.

Copilot is in its very early days; it’s called a “Technical Preview,” not even a beta. It’s certainly not problem-free. The code it generates is often incorrect (though you can ask it to create any number of alternatives, and one is likely to be correct). But it will almost certainly get better, and it will probably get better fast. When the code works, it’s often low-quality; as Jeremy Howard writes, language models reflect an average of how people use language, not great literature. Copilot is the same. But more importantly, as Howard says, most of a programmer’s work isn’t writing new code: it’s designing, debugging, and maintaining code. To use Copilot well, programmers will have to realize the trade-off: most of the work of programming won’t go away. You will need to understand, at a higher level, what you’re trying to do. For Sanfilippo, and for most good or great programmers, the interesting, challenging part of programming comes in that higher-level work, not in slinging curly braces.

By reducing the labor of writing code, allowing people to focus their effort on higher-level thought about what they want to do rather than on syntactic correctness, Copilot will certainly make creative computing possible for more people. And that’s democratization.

Glitch

Glitch, which has become a compelling platform for developing web applications, is another alternative. Glitch claims to return to the copy/paste model from the early days of web development, when you could “view source” for any web page, copy it, and make any changes you want. That model doesn’t eliminate code, but offers a different approach to understanding coding. It reduces the amount of code you write; this in itself is democratizing because it enables more people to accomplish things more quickly. Learning to program isn’t fun if you have to work for six months before you can build something you actually want. It gets you interacting with code that’s already written and working from the start (Figure 4); you don’t have to stare at a blank screen and invent all the technology you need for the features you want. And it’s completely portable: Glitch code is just HTML, CSS, and JavaScript stored in a GitHub archive. You can take that code, modify it, and deploy it anywhere; you’re not stuck with Glitch’s proprietary app. Anil Dash, Glitch’s CEO, calls this “Yes code”, affirming the importance of code. Great artists steal from each other, and so do the great coders; Glitch is a platform that facilitates stealing, in all the best ways.

Figure 4. Glitch’s prepopulated, comment-heavy React web application, which guides the user to using its code (source: Glitch)

Forms and Templates

Finally, many low-code platforms make heavy use of forms. This is particularly common among business intelligence (BI) platforms. You could certainly argue that filling in a form isn’t low-code at all, it’s just using a canned app; but think about what’s happening. The fields in the form are typically a template for filling in a complex SQL statement. A relational database executes that statement, and the results are formatted and displayed for the users. This is certainly democratizing: SQL expertise isn’t expected of most managers—or, for that matter, of most programmers. BI applications unquestionably allow people to do what they couldn’t do otherwise. (Anyone at O’Reilly can look up detailed sales data in O’Reilly’s BI system, even those of us who have never learned SQL or written programs in any language.) Painlessly formatting the results, including visualizations, is one of the qualities that made Excel revolutionary.
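
As a minimal sketch of what such a form might do behind the scenes (the table, field names, and database file are invented for illustration; a real BI tool layers query generation, caching, and visualization on top), the form’s fields simply become parameters in a prepared SQL statement:

    import sqlite3

    # Hypothetical form input: the user picks a region, a product line,
    # and a date range from dropdowns and date pickers.
    form = {"region": "EMEA", "product": "ebooks",
            "start": "2021-01-01", "end": "2021-12-31"}

    query = """
        SELECT product, SUM(amount) AS revenue
        FROM sales
        WHERE region = :region AND product = :product
          AND sale_date BETWEEN :start AND :end
        GROUP BY product
    """

    conn = sqlite3.connect("sales.db")  # stands in for the BI data warehouse
    for row in conn.execute(query, form):
        print(row)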

Similarly, low-code platforms for building mobile and web apps—such as Salesforce, Webflow, Honeycode, and Airtable—give non-programmers drag-and-drop, template-based tools for creating everything from consumer-facing apps to internal workflows. These platforms purport to be customizable, but what you can build is ultimately bounded by the offerings and capabilities of each particular platform.

But do these templating approaches really allow a user to become creative? That may be the more important question. Templates arguably don’t. They allow the user to create one of a number (possibly a large number) of previously defined reports. But they rarely allow a user to create a new report without significant programming skills. In practice, regardless of how simple it may be to create a report, most users don’t go out of their way to create new reports. The problem isn’t that templating approaches are “ultimately finite”—that trade-off of limitations against ease comes with almost any low-code approach, and some template builders are extremely flexible. It’s that, unlike Excel, and unlike LabVIEW, and unlike Glitch, these tools don’t really offer new ways to think about problems.

It’s worth noting—in fact, it’s absolutely essential to note—that these low-code approaches rely on huge amounts of traditional code. Even LabVIEW, completely visual as it is, was implemented (along with its G language) in a traditional programming language. What they’re really doing is allowing people with minimal coding skills to make connections between libraries. They enable people to work by connecting things together, rather than building the things that are being connected. That will turn out to be very important, as we’ll start to examine next.

Rethinking the Programmer

Programmers have cast themselves as gurus and rockstars, or as artisans, and to a large extent resisted democratization. In the web space, that has been very explicit: people who use HTML and CSS, but not sophisticated JavaScript, are “not real programmers.” It’s almost as if the evolution of the web from a Glitch-like world of copy and paste towards complex web apps took place with the intention of forcing out the great unwashed, and creating an underclass of the coding-disabled.

Low-code and no-code are about democratization, about extending the ability to be creative with computers and creating new citizen programmers. We’ve seen that it works in two ways: on the low end (as with Excel), it allows people with no formal programming background to perform computational tasks. Perhaps more significantly, Excel (and similar tools) allow a user to gradually work up the ladder to more complex tasks: from simple formatting to spreadsheets that do computation, to full-fledged programming.

Can we go further? Can we enable subject matter experts to build sophisticated applications without needing to communicate their understanding to a group of coders? At the Strata Data Conference in 2019, Jeremy Howard discussed an AI application for classifying burns. This deep-learning application was trained by a dermatologist—a subject matter expert—who had no knowledge of programming. All the major cloud providers have services for automating machine learning, and there’s an ever-increasing number of AutoML tools that aren’t tied to a specific provider. Eliminating the knowledge transfer between the SME and the programmer by letting SMEs build the application themselves is the shortest route to building better software.

On the high end, the intersection between AI and programming promises to make skilled programmers more productive by making suggestions, detecting bugs and vulnerabilities, and writing some of the boilerplate code itself. IBM is trying to use AI to automate translations between different programming languages; we’ve already mentioned Microsoft’s work on generating code from human-language descriptions of programming tasks, culminating with their Copilot project. This technology is still in the very early days, but it has the potential to change the nature of programming radically.

These changes suggest that there’s another way of thinking about programmers. Let’s borrow the distinction between “blue-collar” and “white-collar” workers. Blue-collar programmers connect things; white-collar programmers build the things to be connected. This is similar to the distinction between the person who installs or connects household appliances and the person who designs them. You wouldn’t want your plumber designing your toilet; but likewise, you wouldn’t want a toilet designer (who wears a black turtleneck and works in a fancy office building) to install the toilet they designed.

This model is hardly a threat to the industry as it’s currently institutionalized. We will always need people to connect things; that’s the bulk of what web developers do now, even those working with frameworks like React.js. In practice, there has been—and will continue to be—a lot of overlap between the “tool designer” and “tool user” roles. That won’t change. The essence of low-code is that it allows more people to connect things and become creative. We must never undervalue that creativity, but likewise, we have to understand that more people connecting things—managers, office workers, executives—doesn’t reduce the need for professional tools, any more than 3D printers reduced the need for manufacturing engineers.

The more people who are capable of connecting things, the more things need to be connected. Programmers will be needed to build everything from web widgets to the high-level tools that let citizen programmers do their work. And many citizen programmers will see ways for tools to be improved or have ideas about new tools that will help them become more productive, and will start to design and build their own tools.

Rethinking Programmer Education

Once we make the distinction between blue- and white-collar programmers, we can talk about what kinds of education are appropriate for the two groups. A plumber goes to a trade school and serves an apprenticeship; a designer goes to college, and may serve an internship. How does this compare to the ways programmers are educated?

As complex as modern web frameworks like React.js may be (and we suspect they’re a very programmerly reaction against democratization), you don’t need a degree to become a competent web developer. The educational system is beginning to shift to take this into account. Boot camps (a format probably originating with Gregory Brown’s Ruby Mendicant University) are the programmer’s equivalent of trade schools. Many boot camps facilitate internships and initial jobs. Many students at boot camps already have degrees in a non-technical field, or in a technical field that’s not related to programming.

Computer science majors in colleges and universities provide the “designer” education, with a focus on theory and algorithms. Artificial intelligence is a subdiscipline that originated in academia, and is still driven by academic research. So are disciplines like bioinformatics, which straddles the boundaries between biology, medicine, and computer science. Programs like Data Carpentry and Software Carpentry (two of the three organizations that make up “The Carpentries”) cater specifically to graduate students who want to improve their data or programming skills.

This split matches a reality that we’ve always known. You’ve never needed a four-year computer science degree to get a programming job; you still don’t. There are many, many programmers who are self-taught, and some startup executives who never entered college (let alone finished it); as one programmer who left a senior position to found a successful startup once said in conversation, “I was making too much money building websites when I was in high school.” No doubt some of those who never entered college have made significant contributions in algorithms and theory.

Boot camps and four-year institutions both have weaknesses. Traditional colleges and universities pay little attention to the parts of the job that aren’t writing code (teamwork, testing, agile processes), or to areas of software development that are central to the industry now, such as cloud computing. Students need to learn how to use databases and operating systems effectively, not design them. Boot camps, on the other hand, range from the excellent to the mediocre. Many go deep on a particular framework, like Rails or React.js, but don’t give students a broader introduction to programming. Many engage in ethically questionable practices around payment (boot camps aren’t cheap) and job placement. Picking a good boot camp may be as difficult as choosing an undergraduate college.

To some extent, the weaknesses of boot camps and traditional colleges can be helped through apprenticeships and internships. However, even that requires care: many companies use the language of “agile” and CI/CD, but have only renamed their old, ineffective processes. How can interns be placed in positions where they can learn modern programming practices, when the companies in which they’re placed don’t understand those practices? That’s a critical problem, because we expect that trained programmers will, in effect, be responsible for bringing these practices to the low-code programmers.

Why? The promise is that low-code allows people to become productive and creative with little or no formal education. We aren’t doing anyone a service by sneaking educational requirements in through the back door. “You don’t have to know how to program, but you do have to understand deployment and testing”—that misses the point. Yet deployment and testing are essential if we want software built by low-code developers to be reliable and deployable—and if software created by citizen programmers can’t be deployed, “democratization” is a fraud. That’s another place where professional software developers fit in. We will need people who can create and maintain the pipelines by which software is built, tested, archived, and deployed. Those tools already exist for traditional code-heavy languages; but new tools will be needed for low-code frameworks. And the programmers who create and maintain those tools will need to have experience with current software development practices. They will become the new teachers, teaching everything about computing that isn’t coding.

Education doesn’t stop there; good professionals are always learning. Acquiring new skills will be part of both the blue-collar and white-collar programmer experience long after low-code becomes pervasive.

Rethinking the Industry

If programmers change, so will the software industry. We see three changes. In the last 20 years, we’ve learned a lot about managing the software development process. That’s an intentionally vague phrase that includes everything from source management (which has a history that goes back to the 1970s) to continuous deployment pipelines. And we have to ask: if useful work is coming from low-code developers, how do we maintain that? What does GitHub for Excel, LabVIEW, or GPT-3 look like? When something inevitably breaks, what will debugging and testing look like when dealing with low-code programs? What does continuous delivery mean for applications written with SAP or PageMaker? Glitch, Copilot, and Microsoft’s Power Fx are the only low-code systems we’ve discussed that can answer this question right now. Glitch fits into CI/CD practice because it’s a system for writing less code, and copying more, so it’s compatible with our current tooling. Likewise, Copilot helps you write code in a traditional programming language that works well with CI/CD tools. Power Fx fits because it’s a traditional text-based language: Excel formulas without the spreadsheet. (It’s worth noting that Excel’s .xlsx files aren’t amenable to source control, nor do they have great tools for debugging and testing, which are a standard part of software development.) Extending fundamental software development practices like version control, automated testing, and continuous deployment to other low-code and no-code tools sounds like a job for programmers, and one that’s still on the to-do list.

Making tool designers and builders more effective will undoubtedly lead to new and better tools. That almost goes without saying. But we hope that if coders become more effective, they will spend more time thinking about the code they write: how it will be used, what problems it’s trying to solve, what ethical questions those problems raise, and so on. This industry has no shortage of badly designed and ethically questionable products. Rather than rushing a product into release without considering its implications for security and safety, perhaps making software developers more effective will let them spend more time thinking about these issues up front, and during the process of software development.

Finally, an inevitable shift in team structure will occur across the industry, allowing programmers to focus on solving with code what low-code solutions can’t solve, and ensuring that what is solved through low-code solutions is carefully monitored and corrected. Just as spreadsheets can be buggy and an errant decimal or bad data point can sink businesses and economies, buggy low-code programs built by citizen programmers could just as easily cause significant headaches. Collaboration—not further division—between programmers and citizen programmers within a company will ensure that low-code solutions are productive, not disruptive, as programming becomes further democratized. Rebuilding teams with this kind of collaboration and governance in mind could increase productivity for companies large and small—giving smaller companies that can’t afford specialization the ability to diversify their applications, and allowing larger companies to build more impactful and ethical software.

Rethinking Code Itself

Still, when we look at the world of low-code and no-code programming, we feel a nagging disappointment. We’ve made great strides in producing libraries that reduce the amount of code programmers need to write; but it’s still programming, and that’s a barrier in itself. We’ve seen limitations in other low-code or no-code approaches; they’re typically “no code until you need to write code.” That’s progress, but only progress of a sort. Many of us would rather program in Python than in PL/I or Fortran, but that’s a difference of quality, not of kind. Are there any ways to rethink programming at a fundamental level? Can we ever get beyond 80-character lines that, no matter how good our IDEs and refactoring tools might be, are really just virtual punch cards?

Here are a few ideas.

Bret Victor’s Dynamicland represents a complete rethinking of programming. It rejects the notion of programming with virtual objects on laptop screens; it’s built upon the idea of working with real-world objects, in groups, without the visible intermediation of computers. People “play” with objects on a tabletop; sensors detect and record what they’re doing with the objects. The way objects are arranged becomes the program. It’s more like playing with Lego blocks (in real life, not some virtual world), or with paper and scissors, than the programming that we’ve become accustomed to. And the word “play” is important. Dynamicland is all about reenvisioning computing as play rather than work. It’s the most radical attempt at no-code programming that we’ve seen.

Dynamicland is a “50-year project.” At this point, we’re 6 years in: only at the beginning. Is it the future? We’ll see.

If you’ve followed quantum computing, you may have seen quantum circuit notation (shown in Figure 5), a way of writing quantum programs that looks sort of like music: a staff composed of lines representing qubits, with operations connecting those lines. We’re not going to discuss quantum programming; we find this notation suggestive for other reasons. Could it represent a different way to look at the programming enterprise? Kevlin Henney has talked about programming as managing space and time. Traditional programming languages are (somewhat) good about space; languages like C, C++, and Java require you to define datatypes and data structures. But we have few tools for managing time, and (unsurprisingly) it’s hard to write concurrent code. Music is all about time management. Think of a symphony and the 100 or so musicians as independent “threads” that have to stay synchronized—or think of a jazz band, where improvisation is central, but synchronization remains a must. Could a music-aware notation (such as Sonic Pi) lead to new ways of thinking about concurrency? And would such a notation be more approachable than virtual punch cards? This rethinking will inevitably fail if it tries too literally to replicate staves, note values, clefs, and such; but it may be a way to free ourselves from thinking about business as usual.

Figure 5. Quantum circuit notation (source: Programming Quantum Computers)
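
We can’t reproduce a musical notation here, but a small sketch in a conventional language shows how awkward time management currently is: keeping even two “musicians” in step requires explicit synchronization machinery. (This is a loose analogy for illustration, not a proposal for a music-based language.)

    import threading

    # Two "musicians" (threads) that must finish each bar of the piece together.
    barrier = threading.Barrier(2)

    def musician(name, bars=4):
        for bar in range(1, bars + 1):
            print(f"{name}: playing bar {bar}")
            barrier.wait()  # nobody starts the next bar until everyone arrives

    players = [threading.Thread(target=musician, args=(name,))
               for name in ("violin", "cello")]
    for p in players:
        p.start()
    for p in players:
        p.join()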

Here’s an even more radical thought. At an early Biofabricate conference, a speaker from Microsoft was talking about tools for programming DNA. He said something mind-blowing: we often say that DNA is a “programming language,” but it has control structures that are unlike anything in our current programming languages. It’s not clear that those programming structures are representable in a text. Our present notion of computation—and, for that matter, of what’s “computable”—derives partly from the Turing machine (a thought experiment) and Von Neumann’s notion of how to build such a machine. But are there other kinds of machines? Quantum computing says so; DNA says so. What are the limits of our current understanding of computing, and what kinds of notation will it take to push beyond those limits?

Finally, programming has been dominated by English speakers, and programming languages are, with few exceptions, mangled variants of English. What would programming look like in other languages? There are programming languages in a number of non-English languages, including Arabic, Chinese, and Amharic. But the most interesting is the Cree# language, because it isn’t just an adaptation of a traditional programming language. Cree# tries to reenvision programming in terms of the indigenous American Cree culture, which revolves around storytelling. It’s a programming language for stories, built around the logic of stories. And as such, it’s a different way of looking at the world. That way of looking at the world might seem like an arcane curiosity (and currently Cree# is considered an “esoteric programming language”); but one of the biggest problems facing the artificial intelligence community is developing systems that can explain the reason for a decision. And explanation is ultimately about storytelling. Could Cree# provide better ways of thinking about algorithmic explainability?

Where We’ve Been and Where We’re Headed

Does a new way of programming increase the number of people who are able to be creative with computers? It has to; in “The Rise of the No Code Economy”, the authors write that relying on IT departments and professional programmers is unsustainable. We need to enable people who aren’t programmers to develop the software they need. We need to enable people to solve their own computational problems. That’s the only way “digital transformation” will happen.

We’ve talked about digital transformation for years, but relatively few companies have done it. One lesson to take from the COVID pandemic is that every business has to become an online business. When people can’t go into stores and restaurants, everything from the local pizza shop to the largest retailers needs to be online. When everyone is working at home, they are going to want tools to optimize their work time. Who is going to build all that software? There may not be enough programming talent to go around. There may not be enough of a budget to go around (think about small businesses that need to transact online). And there certainly won’t be the patience to wait for a project to work its way through an overworked IT department. Forget about yesterday’s arguments over whether everyone should learn to code. We are entering a business world in which almost everyone will need to code—and low-, no-, and yes-code frameworks are necessary to enable that. To enable businesses and their citizen programmers to be productive, we may see a proliferation of DSLs: domain-specific languages designed to solve specific problems. And those DSLs will inevitably evolve towards general purpose programming languages: they’ll need web frameworks, cloud capabilities, and more.

“Enterprise low-code” isn’t all there is to the story. We also have to consider what low-code means for professional programmers. Doing more with less? We can all get behind that. But for professional programmers, “doing more with less” won’t mean using a templating engine and a drag-and-drop interface builder to create simple database applications. These tools inevitably limit what’s possible—that’s precisely why they’re valuable. Professional programmers will be needed to do what the low-code users can’t. They build new tools, and make the connections between these tools and the old tools. Remember that the amount of “glue code” that connects things rises as the square of the number of things being connected, and that most of the work involved in gluing components together is data integration, not just managing formats. Anyone concerned about computing jobs drying up should stop worrying; low-code will inevitably create more work, rather than less.
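
A back-of-the-envelope sketch of that scaling (assuming, as a worst case, that every pair of components needs its own piece of glue):

    # Pairwise connections among n components: n * (n - 1) / 2,
    # which grows roughly as the square of n.
    def connections(n):
        return n * (n - 1) // 2

    for n in (5, 10, 20, 40):
        print(f"{n} components -> {connections(n)} potential connections")
    # 5 -> 10, 10 -> 45, 20 -> 190, 40 -> 780: doubling the number of
    # components roughly quadruples the amount of glue.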

There’s another side to this story, though: what will the future of programming look like? We’re still working with paradigms that haven’t changed much since the 1950s. As Kevlin Henney pointed out in conversation, most of the trendy new features in programming languages were actually invented in the 1970s: iterators, foreach loops, multiple assignment, coroutines, and many more. A surprising number of these go back to the CLU language from 1975. Will we continue to reinvent the past, and is that a bad thing? Are there fundamentally different ways to describe what we want a computer to do, and if so, where will those come from? We started with the idea that the history of programming was the history of “less code”: finding better abstractions, and building libraries to implement those abstractions—and that progress will certainly continue. It will certainly be aided by tools like Copilot, which will enable subject matter experts to develop software with less help from professional programmers. AI-based coding tools might not generate “less” code–but humans won’t be writing it. Instead, they’ll be thinking and analyzing the problems that they need to solve.

But what happens next? A tool like Copilot can handle a lot of the “grunt work” that’s part of programming, but it’s (so far) built on the same set of paradigms and abstractions. Python is still Python. Linked lists and trees are still linked lists and trees, and getting concurrency right is still difficult. Are the abstractions we inherited from the past 70 years adequate to a world dominated by artificial intelligence and massively distributed systems?

Probably not. Just as the two-dimensional grid of a spreadsheet allows people to think outside the box defined by lines of computer code, and just as the circuit diagrams of LabVIEW allow engineers to envision code as wiring diagrams, what will give us new ways to be creative? We’ve touched on a few: musical notation, genetics, and indigenous languages. Music is important because musical scores are all about synchronization at scale; genetics is important because of control structures that can’t be represented by our ancient IF and FOR statements; and indigenous languages help us to realize that human activity is fundamentally about stories. There are, no doubt, more. Is low-code the future—a “better abstraction”? We don’t know, but it will almost certainly enable different code.


We would like to thank the following people whose insight helped inform various aspects of this report: Daniel Bryant, Anil Dash, Paul Ford, Kevlin Henney, Danielle Jobe, and Adam Olshansky.

Remote Teams in ML/AI [Radar]

I’m well-versed in the ups and downs of remote work. I’ve been doing some form thereof for most of my career, and I’ve met plenty of people who have a similar story. When companies ask for my help in building their ML/AI teams, I often recommend that they consider remote hires. Sometimes I’ll even suggest that they build their data function as a fully-remote, distributed group. (I’ll oversimplify for brevity, using “remote team” and “distributed team” interchangeably. And I’ll treat both as umbrella terms that cover “remote-friendly” and “fully-distributed.”)

Remote hiring has plenty of benefits. As an employer, your talent pool spans the globe and you save a ton of money on office rent and insurance. The people you hire get a near-zero commute and a Covid-free workplace.

Then again, even though you really should build a remote team, you also shouldn’t. Not just yet. You first want to think through one very important question:

Do I, as a leader, really want a remote team?

The Litmus Test

The key ingredient to successful remote work is, quite simply, whether company leadership wants it to work. Yes, it also requires policies, tooling, and re-thinking a lot of interactions. Not to mention, your HR team will need to double-check local laws wherever team members choose to live.  But before any of that, the people in charge have to actually want a remote team.

Here’s a quick test for the executives and hiring managers among you:

  • As the Covid-19 pandemic forced your team to work from home, did you insist on hiring only local candidates (so they could eventually work in the office)?
  • With wider vaccine rollouts and lower case counts, do you now require your team to spend some time in the office every week?
  • Do you see someone as “not really part of the team” or “less suitable for promotion” because they don’t come into the office?

If you’ve said yes to any of these, then you simply do not want a distributed team. You want an in-office team that you begrudgingly permit to work from home now and then. And as long as you don’t truly want one, any attempts to build and support one will not succeed.

And if you don’t want that, that’s fine. I’m not here to change your mind.

But if you do want to build a successful remote team, and you want some ideas on how to make it work, read on.

How You Say What You Have to Say

As a leader, most of your job involves communicating with people. This will require some adjustment in a distributed team environment.

A lot of you have developed a leadership style that’s optimized for everyone being in the same office space during working hours. That has cultivated poor, interruption-driven communication habits. It’s too easy to stop by someone’s office, pop over a cubicle wall, or bump into someone in the hallway and share some information with them.

With a remote team you’ll need to write these thoughts down instead. That also means deciding what you want to do before you even start writing, and then sticking with it after you’ve filed the request.

By communicating your thoughts in clear, unambiguous language, you’ve demonstrated your commitment to what you’re asking someone to do. You’re also leaving them a document they can refer to as they perform the task you’ve requested. This is key because, depending on work schedules, a person can’t just tap you on the shoulder to ask you to clarify a point.

(Side note: I’ve spent my career working with extremely busy people, and being one myself. That’s taught me a lot about how to communicate in written form. Short sentences, bullet points, and starting the message with the call-to-action—sometimes referred to as BLUF: Bottom Line Up-Front—will go a long way in making your e-mails clearer.)

The same holds true for meetings: the person who called the meeting should send an agenda ahead of time and follow up with recap notes. Attendees will be able to confirm their shared understanding of what is to be done and who is doing what.

Does this feel like a lot of documentation? That’s great. In my experience, what feels like over-communication for an in-office scenario is usually the right amount for a distributed team.

Embracing Remote for What It Is

Grammar rules differ by language. You won’t get very far speaking the words of a new language while using grammatical constructs from your native tongue. It takes time, practice, and patience to learn the new language so that you can truly express yourself.  The path takes you from “this is an unnatural and uncomfortable word order” to “German requires that I put the verb’s infinitive at the end of the clause.  That’s just how it works.”

There are parallels here to leading a distributed team. It’s too easy to assume that “remote work” is just “people re-creating the in-office experience, from their kitchen tables.” It will most certainly feel unnatural and uncomfortable if you hold that perspective.  And it should feel weird, since optimizing for remote work will require re-thinking a lot of the whats and hows of team interactions and success metrics.  You start winning when you determine where a distributed team works out better than the in-office alternative.

Remote work is people getting things done from a space that is not your central office, on time schedules that aren’t strict 9-to-5, and maybe even communicating in text-based chat systems.  Remote work is checking your messages in the morning, and seeing a stream of updates from your night-owl teammates.  Remote work is its own thing, and trying to shoe-horn it into the shape of an in-office setup means losing out on all of the benefits.

Embracing remote teams will require letting go of outdated in-office tropes to accept some uncomfortable truths. People will keep working when you’re not looking over their shoulder.  Some of them will work even better when they can do so in the peace and quiet of an environment they control.  They can be fully present in a meeting, even if they’ve turned off their video. They can most certainly be productive on a work schedule that doesn’t match yours, while wearing casual attire.

The old tropes were hardly valid to begin with. And now, 18 months after diving head-first into remote work, those tropes are officially dead. It’s up to you to learn new ways to evaluate team (and team member) productivity. More importantly, in true remote work fashion, you’ll have to step back and trust the team you’ve hired.

Exploring New Terrain

If distributed teamwork is new territory for your company, expect to stumble now and then. You’re walking through a new area and instead of following your trusty old map, you’re now creating the map. One step at a time, one stubbed toe at a time.

You’ll spend time defining new best practices that are specific to this environment. This will mean thinking through a lot more decisions than before—decisions that you used to be able to handle on autopilot—and as such you will find yourself saying “I don’t know” a lot more than you used to.

You’ll feel some of this friction when sorting out workplace norms.  What are “working hours,” if your team even has any?  Maybe all you need is a weekly group check-in, after which everyone heads in separate directions to focus on their work?  In that case, how will individuals specify their working hours and their off-time?  With so much asynchronous communication, there’s bound to be confusion around when a person is expected to pick up on an ongoing conversation in a chat channel, versus their name being @-mentioned, or contacting them by DM.  Setting those expectations will help the team shift into (the right kind of) autopilot, because they’ll know to not get frustrated when a person takes a few hours to catch up on a chat thread.  As a bonus, going through this exercise will sort out when you really need to hold a group meeting versus when you have to just make an announcement (e-mail) or pose a quick question (chat).

Security will be another source of friction.  When everyone is in the same physical office space, there’s little question of the “inside” versus the “outside” network.  But when your teammates are connecting to shared resources from home or a random cafe, how do you properly wall off the office from everything else? Mandating VPN usage is a start, but it’s hardly the entire picture.  There are also questions around company-issued devices having visibility into home-network traffic, and what they’re allowed to do with that information.  Or even a company laptop, hacked through the company network, infecting personal devices on the home LAN. Is your company’s work so sensitive that employees will require a separate, work-only internet service for their home office?  That would be fairly extreme—in my experience, I haven’t even seen banks go that far—but it’s not out of the realm of possibility.  At some point a CISO may rightfully determine that this is the best path.

Saying “I don’t know” is OK in all of these cases, so long as you follow that with “so let’s figure it out.” Be honest with your team to explain that you, as a group, may have to try a few rounds of something before it all settles. The only two sins here are to refuse to change course when it’s not working, and to revert to the old, familiar, in-office ways just to ease your cognitive burden. So long as you are thoughtful and intentional in your approach, you’ll succeed over the long run.

It’s Here to Stay

Your data scientists (and developers, and IT ops team) have long known that remote work is possible. They communicate through Slack and collaborate using shared documents. They see that their “datacenter” is a cloud infrastructure. They already know that a lot of their day-to-day interactions don’t require everyone being in the same office. Company leadership is usually the last to pick up on this, which is why they tend to show the most resistance.

If adaptive leadership is the key to success with distributed teams, then discipline is the key to that adaptation. You’ll need the discipline to plan your communication, to disable your office autopilot, and to trust your team more.

You must focus on what matters—defining what needs to get done, and letting people do it—and learn to let go of what doesn’t. That will be uncomfortable, yes. But your job as a leader is to clear the path for people who are doing the implementation work. What makes them comfortable trumps what makes you comfortable.

Not every company will accept this. Some are willing to trade the benefits of a distributed team for what they perceive to be a superior in-office experience. And that’s fine. But for those who want it, remote is here to stay.

Radar trends to watch: November 2021 [Radar]

While October’s news was dominated by Facebook’s (excuse me, Meta’s) continued problems (you’d think they’d get tired of the apology tour), the most interesting news comes from the AI world. I’m fascinated by the use of large language models to analyze the “speech” of whales, and to preserve endangered human languages. It’s also important that machine learning seems to have taken a step (pun somewhat intended) forward, with robots that teach themselves to walk by trial and error, and with robots that learn how to assemble themselves to perform specific tasks.

AI

  • The design studio Artefact has created a game to teach middle school students about algorithmic bias.
  • Researchers are building large natural language models, potentially the size of GPT-3, to decode the “speech” of whales.
  • A group at Berkeley has built a robot that uses reinforcement learning to teach itself to walk from scratch–i.e., through trial and error. They used two levels of simulation before loading the model into a physical robot.
  • AI is reinventing computers: AI is driving new kinds of CPUs, new “out of the box” form factors (doorbells, appliances), decision-making rather than traditional computation. The “computer” as the computational device we know may be on the way out.
  • Weird creatures: Unimals, or universal animals, are robots that can use AI to evolve their body shapes so they can solve problems more efficiently. Future generations of robotics might not be designed with fixed bodies, but have the capability to adapt their shape as needed.
  • Would a National AI Cloud be a subsidy to Google, Facebook, et al., a threat to privacy, or a valuable academic research tool?
  • I’ve been skeptical about digital twins; they seem to be a technology looking for an application. However, Digital Twins (AI models of real-world systems, used for predicting their behavior) seem like a useful technology for optimizing the performance of large batteries.
  • Digital Twins could provide a way to predict supply chain problems and work around shortages. They could allow manufacturers to navigate a compromise between just-in-time stocking processes, which are vulnerable to shortages, and resilience.
  • Modulate is a startup currently testing real-time voice changing software. They provide realistic, human sounding voices that replace the user’s own voice. They are targeting gaming, but the software is useful in many situations where harassment is a risk.
  • Voice copying algorithms were able to fool both people and voice-enabled devices roughly 50% of the time (30% for Azure’s voice recognition service, 62% for Alexa). This is a new front in deep fakery.
  • Facebook AI Research has created a set of first-person (head-mounted camera) videos called Ego4D for training AI.  They want to build AI models that see the world “as a person sees it,” and be able to answer questions like “where did I leave my keys.” In essence, this means that they will need to collect literally everything that a subscriber does.  Although Facebook denies that they are thinking about commercial applications, there are obvious connections to Ray-Ban Stories and their interest in augmented reality.
  • DeepMind is working on a deep learning model that can emulate the output of any algorithm.  This is called Neuro Algorithmic Reasoning; it may be a step towards a “general AI.”
  • Microsoft and NVIDIA announce a 530 billion parameter natural language model named Megatron-Turing NLG 530B.  That’s bigger than GPT-3 (175B parameters).
  • Can machine learning be used to document endangered indigenous languages and aid in language reclamation?
  • Beethoven’s 10th symphony completed by AI: I’m not convinced that this is what Beethoven would have written, but this is better than other (human) attempts to complete the 10th that I’ve heard. It sounds like Beethoven, for the most part, though it quickly gets aimless.
  • I’m still fascinated by techniques to foil face recognition. Here’s a paper about an AI system that designs minimal, natural-looking makeup that reshapes the parts of the face that face recognition algorithms are most sensitive to, without substantially altering a person’s appearance.

Ethics

  • Thoughtworks’ Responsible Tech Playbook is a curated collection of tools and techniques to help organizations become more aware of bias and become more inclusive and transparent.

Programming

  • Kerla is a Linux-like operating system kernel written in Rust that can run most Linux executables. I doubt this will ever be integrated into Linux, but it’s yet another sign that Rust has joined the big time.
  • OSS Port is an open source tool that aims to help developers understand large codebases. It parses a project repository on GitHub and produces maps and tours of the codebase. It currently works with JavaScript, Go, Java, and Python, with Rust support promised soon.
  • Turing Complete is a game about computer science. That about says it…
  • wasmCloud is a runtime environment that can be used to build distributed systems with wasm in the cloud. WebAssembly was designed as a programming-language-neutral virtual machine for  browsers, but it increasingly looks like it will also find a home on the server side.
  • Adobe Photoshop now runs in the browser, using wasm and Emscripten (the C++ toolchain for wasm).  In addition to compiling C++ to wasm, Emscripten also translates POSIX system calls to web API calls and converts OpenGL to WebGL.
  • JQL (JSON Query Language) is a Rust-based language for querying JSON (what else?).

Security

  • Microsoft has launched an effort to train 250,000 cyber security workers in the US by 2025. This effort will work with community colleges. They estimate that it will only make up 50% of the shortfall in security talent.
  • Integrating zero trust security into the software development lifecycle is really the only way forward for companies who rely on systems that are secure and available.
  • A supply chain attack against a Node.js library (UA-Parser-JS) installs crypto miners and trojans for stealing passwords on Linux and Windows systems. The library’s normal function is to parse user agent strings, identifying the browser, operating system, and other parameters.
  • A cybercrime group has created penetration testing consultancies whose purpose is to acquire clients and then gather information and initiate ransomware attacks against those clients.
  • A federated cryptographic system will allow sharing of medical data without compromising patient privacy.  This is an essential element in “predictive, preventive, personalized, and participatory” medicine (aka P4).
  • The European Parliament has taken steps towards banning surveillance based on biometric data, private face recognition databases, and predictive policing.
  • Is it possible to reverse-engineer the data on which a model was trained? An attack against a fake face generator was able to identify the original faces in the training data. This has important implications for privacy and security, since it appears to generalize to other kinds of data.
  • Adversarial attacks against machine learning systems present a different set of challenges for cybersecurity. Models aren’t code, and have their own vulnerabilities and attack vectors. Atlas is a project to define the machine learning threat landscape. Tools to harden machine learning models against attack include IBM’s Adversarial Robustness Toolbox and Microsoft’s Counterfit.
  • Researchers have discovered that you can encode malware into DNA that attacks sequencing software and gives the attacker control of the computer.  This attack hasn’t (yet) been found in the wild.
  • Masscan is a next generation, extremely fast port scanner.  It’s similar to nmap, but much faster; it claims to be able to scan the entire internet in 6 minutes.
  • ethr is an open source cross-platform network performance measurement tool developed by Microsoft in Go. Right now, it looks like the best network performance tool out there.
  • Self-aware systems monitor themselves constantly and are capable of detecting (and even repairing) attacks.

Infrastructure and Operations

  • Interesting insights into how site reliability engineering actually works at Google. SRE is intentionally a scarce resource; teams should solve their own problems. Their goal is to help dev teams attain reliability and performance objectives with engineering rather than brute force.

Devices and Things

  • Amazon is working on an Internet-enabled refrigerator that will keep track of what’s in it and notify you when you’re low on supplies.  (And there are already similar products on the market.) Remember when this was a joke?
  • Consumer-facing AI: On one hand, “smart gadgets” present a lot of challenges and opportunities. On the other hand, the category needs better deliverables than “smart” doorbells. Smart hearing aids that are field-upgradable as a subscription service?
  • A drone has been used to deliver a lung for organ transplant. This is only the second time a drone has been used to carry organs for transplantation.
  • Intel has released its next generation neuromorphic processor, Loihi. Neuromorphic processors are based on the structure of the brain, in which neurons asynchronously send each other signals. While they are still a research project, they appear to require much less power than traditional CPUs.

Web

  • ipleak and dnsleaktest are sites that tell you what information your browser leaks. They are useful tools if you’re interested in preserving privacy. The results can be scary.
  • Dark design is the practice of designing interfaces that manipulate users into doing things they might not want to do, whether that’s agreeing to give up information about their web usage or clicking to buy a product. Dark patterns are already common, and becoming increasingly prevalent.
  • Black Twitter has become the new “Green Book,” a virtual place for tips on dealing with a racist society. The original Green Book was a Jim Crow-era publication that told Black people where they could travel safely, which hotels would accept them, and where they were likely to become victims of racist violence.

Quantum Computing

  • A group at Duke University has made significant progress on error correcting quantum computing. They have created a “logical qubit” that can be read with a 99.4% probability of being correct. (Still well below what is needed for practical quantum computing.)
  • There are now two claims of quantum supremacy from Chinese quantum computing projects.

Miscellaneous

  • Would our response to the COVID pandemic have been better if it had been approached as an engineering problem, rather than as scientific research?

The Sobering Truth About the Impact of Your Business Ideas [Radar]

The introduction of data science into the business world has contributed far more than recommendation algorithms; it has also taught us a lot about the efficacy with which we manage our businesses. Specifically, data science has introduced rigorous methods for measuring the outcomes of business ideas. These are the strategic ideas that we implement in order to achieve our business goals. For example, “We’ll lower prices to increase demand by 10%” and “we’ll implement a loyalty program to improve retention by 5%.” Many companies simply execute on their business ideas without measuring if they delivered the impact that was expected. But, science-based organizations are rigorously quantifying this impact and have learned some sobering lessons:

  1. The vast majority of business ideas fail to generate a positive impact.
  2. Most companies are unaware of this.
  3. It is unlikely that companies will increase the success rate for their business ideas.

These are lessons that could profoundly change how businesses operate. In what follows, we flesh out the three assertions above; the bulk of the content explains why it may be difficult to improve the poor success rate for business ideas. Despite the challenges, we conclude with some recommendations for better managing your business.

(1) The vast majority of business ideas fail to generate positive results

To properly measure the outcomes of business ideas, companies are embracing experimentation (a.k.a. randomized controlled trials or A/B testing). The process is simple in concept. Before rolling out a business idea, you test; you try the idea out on a subset of customers1 while another group—a control group—is not exposed to the new idea. When properly sampled, the two groups will exhibit the same attributes (demographics, geographics, etc.) and behaviors (purchase rates, lifetime value, etc.). Therefore, when the intervention is introduced—i.e., the exposure to the new business idea—any changes in behavior can be causally attributed to the new business idea. This is the gold standard in scientific measurement used in clinical trials for medical research, biological studies, pharmaceutical trials, and now to test business ideas.
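
For readers who want to see the mechanics, here is a minimal sketch of how the result of such an experiment might be evaluated. The conversion counts are invented, and real experimentation programs add larger samples, corrections for peeking, and multiple metrics; the point is only that the comparison is a straightforward statistical test.

    from statistics import NormalDist

    # Hypothetical results: conversions out of visitors in each group.
    control_conv, control_n = 480, 10_000      # not exposed to the new idea
    treatment_conv, treatment_n = 512, 10_000  # exposed to the new idea

    p1, p2 = control_conv / control_n, treatment_conv / treatment_n
    pooled = (control_conv + treatment_conv) / (control_n + treatment_n)
    se = (pooled * (1 - pooled) * (1 / control_n + 1 / treatment_n)) ** 0.5

    z = (p2 - p1) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided z-test

    print(f"lift: {p2 - p1:+.2%}, z = {z:.2f}, p = {p_value:.3f}")
    # With these numbers the lift is +0.32 percentage points but p is about
    # 0.3, so the idea shows no statistically significant impact.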

For the very first time in many business domains, experimentation reveals the causal impact of our business ideas. The results are humbling. They indicate that the vast majority of our business ideas fail to generate positive results. It’s not uncommon for 70-90% of ideas to either have no impact at all or actually move the metrics in the opposite direction of what was intended. Here are some statistics from a few notable companies that have disclosed their success rates publicly:

  • Microsoft declared that roughly one-third of their ideas yield negative results, one-third yield no results, and one-third yield positive results (Kohavi and Thomke, 2017).
  • Streaming service Netflix believes that 90% of its ideas are wrong (Moran, 2007).
  • Google reported that as much as 96.1% of their ideas fail to generate positive results (Thomke, 2020).
  • Travel site Booking.com shared that 9 out of 10 of their ideas fail to improve metrics (Thomke, 2020).

To be sure, the statistics cited above reflect a tiny subset of the ideas implemented by companies. Further, they probably reflect a particular class of ideas: those that are conducive to experimentation2, such as changes to user interfaces, new ad creatives, subtle messaging variants, and so on. Moreover, the companies represented are all relatively young and either are in the tech sector or leverage technology as a medium for their business. This is far from a random sample of all companies and business ideas. So, while it’s possible that the high failure rates are specific to the types of companies and ideas that are convenient to test experimentally, it seems more plausible that the high failure rates are reflective of business ideas in general and that the disparity in perception of their success can be attributed to the method of measurement. We shouldn’t be surprised; high failure rates are common in many domains. Venture capitalists invest in many companies because most fail; similarly, most stock portfolio managers fail to outperform the S&P 500; in biology, most mutations are unsuccessful; and so on. The more surprising aspect of the low success rates for business ideas is that most of us don’t seem to know about it.

(2) Most companies are unaware of the low success rates for their business ideas

Those statistics should be sobering to any organization. Collectively, business ideas represent the roadmap companies rely upon to hit their goals and objectives. However, the dismal failure rates appear to be known only to the few companies that regularly conduct experiments to scientifically measure the impact of their ideas. Most companies do not appear to employ such a practice and seem to have the impression that all or most of their ideas are or will be successful. Planners, strategists, and functional leaders rarely convey any doubts about their ideas. To the contrary, they set expectations on the predicted impact of their ideas and plan for them as if they are certain. They attach revenue goals and even their own bonuses to those predictions. But, how much do they really know about the outcomes of those ideas? If they don’t have an experimentation practice, they likely know very little about the impact their roadmap is actually having.

Without experimentation, companies either don’t measure the outcomes of their ideas at all or use flimsy methods to assess their impacts. In some situations, ideas are acted upon so fluidly that they are not recognized as something that merits measurement.  For example, in some companies an idea such as “we’ll lower prices to increase demand by 10%” might be made on a whim by a marketing exec and there will be no follow up at all to see if it had the expected impact on demand. In other situations, a post-implementation assessment of a business idea is done, but in terms of execution, not impact (“Was it implemented on time?” “Did it meet requirements?” etc., not “What was the causal impact on business metrics?”). In other cases still, post hoc analysis is performed in an attempt to quantify the impact of the idea. But, this is often done using subjective or less-than-rigorous methods to justify the idea as a success. That is, the team responsible for doing the analysis often is motivated either implicitly or explicitly to find evidence of success. Bonuses are often tied to the outcomes of business ideas. Or, perhaps the VP whose idea it was is the one commissioning the analysis. In either case, there is a strong motivation to find success. For example, a company may seek qualitative customer feedback on the new loyalty program in order to craft a narrative for how it is received. Yet, the customers willing to give feedback are often biased towards the positive. Even if more objective feedback were to be acquired it would still not be a measure of impact; customers often behave differently from the sentiments they express. In still other cases, empirical analysis is performed on transaction data in an attempt to quantify the impact. But, without experimentation, at best, such analysis can only capture correlation—not causation. Business metrics are influenced simultaneously by many factors, including random fluctuations. Without properly controlling for these factors, it can be tempting to attribute any uptick in metrics as a result of the new business idea. The combination of malleable measurements and strong incentives to show success likely explain why so many business initiatives are perceived to be successful.

By contrast, the results of experimentation are numeric and austere. They do not care about the hard work that went into executing on a business initiative. They are unswayed by well-crafted narratives, emotional reviews by customers, or an executive’s influence. In short, they are brutally honest and often hard-to-accept.3 Without experimentation, companies don’t learn the sobering truth about their high failure rate. While ignorance is bliss, it is not an effective way to run your business.

(3) It is unlikely that companies will increase the success rate for their business ideas.

At this point, you may be thinking, “we need to get better at separating the wheat from the chaff, so that we only allocate resources to the good ideas.” Sadly, without experimentation, we see little reason for optimism as there are forces that will actively work against your efforts.

One force that is actively working against us is the way we reason about our companies.

We like to reason about our businesses as if they are simple, predictable systems. We build models of their component parts and manage them as if they are levers we can pull in order to predictably manage the business to a desired state. For example, a marketer seeking to increase demand builds a model that allows her to associate each possible price with a predicted level of demand. The scope of the model is intentionally narrow so that she can isolate the impact price has on demand. Other factors like consumer perception, the competitive assortment, operational capacity, the macroeconomic landscape, and so on are out of her control and assumed to remain constant. Equipped with such an intuitive model, she can identify the price that optimizes demand. She’s in control and hitting her goal is merely a matter of execution.

However, experimentation reveals that our predictions for the impact of new business ideas can be radically off—not just a little off in terms of magnitude, but often in the completely wrong direction. We lower prices and see demand go down. We launch a new loyalty program and it hurts retention. Such unintuitive results are far more common than you might think.

The problem is that many businesses behave as complex systems, which cannot be understood by studying their components in isolation. Customers, competitors, partners, market forces—each can adjust in response to the intervention in ways that are not observable from simple models of the components. Just as you can’t learn about an ant colony by studying the behaviors of an individual ant (Mauboussin, 2009), the insights derived from modeling individual components of a business in isolation often have little relevance to the way the business behaves as a whole.

It’s important to note that our use of the term complex does not just mean ‘not simple.’ Complexity is an entire area of research within Systems Theory. Complexity arises in systems with many interacting agents that react and adapt to one another and their environment. Examples of complex systems include weather systems, rain forest ecology, economies, the nervous system, cities, and yes, many businesses.

Reasoning about complex systems requires a different approach. Rather than focusing on component parts, attention needs to be directed at system-wide behaviors. These behaviors are often termed “emergent,” to indicate that they are very hard to anticipate. This frame orients us around learning, not executing. It encourages more trial and error with less attachment to the outcomes of a narrow set of ideas. As complexity researcher Scott E. Page says, “An actor in a complex system controls almost nothing but influences almost everything” (Page, 2009).

An example of an attempt to manage a complex system to change behaviors

To make this tangible, let’s take a look at a real example. Consider the story of the child daycare company featured in the popular book Freakonomics (the original study is Gneezy & Rustichini, 2000). The company faced a challenge with late pickups. The daycare closed at 4:00pm, yet parents would frequently pick up their children several minutes later. This required staff to stay late, causing both expense and inconvenience. Someone in the company had a business idea to address the situation: a fine for late pickups.

Many companies would simply implement the fine and not think to measure the outcome. Fortunately for the daycare, a group of researchers convinced them to run an experiment to measure the effectiveness of the policy. The daycare operates many locations, which were randomly divided into test and control groups; the test sites would implement the late pickup fine while the control sites would leave things as is. The experiment ran its course and, to everyone’s surprise, they learned that the fine actually increased the number of late pickups.
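
Measuring such an effect requires no heavy machinery. As a minimal sketch in Python (with invented counts; the original study performed its own analysis), comparing weekly late-pickup numbers between test and control sites could look like this:

```python
import numpy as np
from scipy import stats

# Invented weekly late-pickup counts per site, purely for illustration.
fine_sites = np.array([14, 17, 12, 16, 19, 15])   # test group: fine introduced
control_sites = np.array([9, 11, 8, 12, 10, 9])   # control group: no change

# Welch's t-test: is the difference in means larger than chance would explain?
t_stat, p_value = stats.ttest_ind(fine_sites, control_sites, equal_var=False)
print(f"mean with fine: {fine_sites.mean():.1f}, "
      f"mean without: {control_sites.mean():.1f}, p = {p_value:.3f}")
```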

How is it possible that the business idea had the opposite effect of what was intended? There are several very plausible explanations, which we summarize below—some of these come from the paper while others are our own hypotheses.

  • The authors of the paper assert that imposing a fine makes the penalty for a late pickup explicitly clear. Parents are generally aware that late pickups are not condoned. But in the absence of a fine, they are unsure what the penalty may be. Some parents may have imagined a penalty much worse than the fine—e.g., expulsion from the daycare. This belief might have been an effective deterrent. But when the fine was imposed, it explicitly quantified the amount of the penalty for a late pickup (roughly equivalent to $2.75 in 1998 dollars). For some parents this was a sigh of relief—expulsion was not on the docket. One merely has to pay a fine for the transgression, making the cost of a late pickup less than what was believed. Hence, late pickups increase (Gneezy & Rustichini, 2000).

  • Another explanation from the paper involves social norms. Many parents may have considered late pickups socially inappropriate and would therefore go to great lengths to avoid them (leaving work early, scrambling for backup coverage, etc.). The fine, however, provides an easier way to stay in good social standing. It’s as if it signals ‘late pickups are not condoned, but if you pay us the fine you are forgiven.’ Therefore, the fine acts as the price to pay to stay in good standing. For some parents this price is low relative to the arduous and diligent planning required to prevent a late pickup. Hence, late pickups increase in the presence of the fine (Gneezy & Rustichini, 2000).

  • Still another explanation (which was only alluded to in the paper) has to do with the perceived cost structure associated with the staff having to stay late. From the parent’s perspective, the burden to the daycare of a late pickup might be considered fixed. If there is already at least one other parent running late, then there is no extra burden imposed, since staff already has to stay. As surmised by the other explanations above, the fine increases the number of late pickups, which therefore increases the probability that staff will have to stay late due to some other parent’s tardiness. Thus, one extra late pickup is no additional burden. Late pickups increase further.

  • One of our own explanations has to do with social norms thresholds. Each parent has a threshold for the appropriateness of late pickups based on social norms. The threshold might be the number of other parents observed or believed to be doing late pickups before such activity seems appropriate. I.e., if others are doing it, it must be okay. (Note: this signal of appropriateness is independent from the perceived fixed cost structure mentioned above.) Since the fine increased the number of late pickups for some parents, other parents observed more late pickups and then followed suit.

The above are plausible explanations for the observed outcome. Some may even seem obvious in hindsight.4 However, these behaviors are extremely difficult to anticipate by focusing your attention on an individual component part: the fine. Such surprising outcomes are less rare than you might think. In this case, the increase in late pickups might have been so apparent that it could have been detected even without the experiment. However, the impact of many ideas often goes undetected.

Another force that is actively working against our efforts to discern good ideas from bad is our cognitive biases. You might be thinking: “Thank goodness my company has processes that filter away bad ideas, so that we only invest in great ideas!” Unfortunately, all companies probably try hard to select only the best ideas, and yet we assert that they are not particularly successful at separating good from bad ideas. We suggest that this is because these processes are deeply human in nature, leaving them vulnerable to cognitive biases.

Cognitive biases are systematic errors in human thinking and decision making (Tversky & Kahneman, 1974). They result from the core thinking and decision making processes that we developed over our evolutionary history. Unfortunately, evolution adapted us to an environment with many differences from the modern world. This can lead to a habit of poor decision making. To illustrate: we know that a healthy bundle of kale is better for our bodies than a big juicy burger. Yet, we have an innate preference for the burger. Many of us will decide to eat the burger tonight. And tomorrow night. And again next week. We know we shouldn’t. But yet our society continues consuming too much meat, fat, and sugar. Obesity is now a major public health problem. Why are we doing this to ourselves? Why are we imbued with such a strong urge—a literal gut instinct—to repeatedly make decisions that have negative consequences for us? It’s because meat, fat, and sugar were scarce and precious resources for most of our evolutionary history. Consuming these resources at every opportunity was an adaptive behavior, and so humans evolved a strong desire to do so. Unfortunately, we remain imbued with this desire despite the modern world’s abundance of burger joints.

Cognitive biases are predictable and pervasive. We fall prey to them despite believing that we are rational and objective thinkers. Business leaders (ourselves included) are not immune. These biases compromise our ability to filter out bad business ideas. They can also make us feel extremely confident as we make a bad business decision. See the following sidebar for examples of cognitive biases manifesting in business environments and producing bad decisions.

Cognitive bias examples

Groupthink (Whyte, 1952) describes our tendency to converge towards shared opinions when we gather in groups. This emerges from a very human impulse to conform; group cohesion was important in our evolutionary past. You might have observed this bias during a prioritization meeting: the group entered with disparate, weakly held opinions, but exited with a consensus opinion that everyone felt confident about. As a hypothetical example: a meeting is called to discuss a disagreement between two departments. Members of the departments have differing but strong opinions, based on solid lines of reasoning and evidence. But once the meeting starts, the attendees begin to self-censor. Nobody wants to look difficult. One attendee recognizes a gaping flaw in the “other side’s” analysis, but doesn’t want to make a key cross-functional partner look bad in front of their boss. Another attendee may have thought the idea was too risky, but, because responsibility for the idea is now diffused across everyone in the meeting, it won’t be her fault if the project fails, and so she acquiesces. Finally, a highly admired senior executive speaks up and everyone converges towards this position (in business lingo, we just heard the HiPPO, or Highest Paid Person’s Opinion; in the scientific vernacular, the Authority Bias (Milgram, 1963)). These social pressures will have collectively stifled the meaningful debate that could have filtered out a bad business decision.

The Sunk Cost bias (Arkes & Blumer, 1985) describes our tendency to justify new investments via past expenditures. In colloquial terms, it’s our tendency to throw good money after bad. We suspect you’ve seen this bias more than a few times in the workplace. As another hypothetical example: A manager is deciding what their team will prioritize over the next fiscal year. They naturally think about incremental improvements that they could make to their team’s core product. This product is based on a compelling idea; however, it hasn’t yet delivered the impact that everyone expected. But the manager has spent so much time and effort building organizational momentum behind the product. The manager gave presentations about it to senior leadership and painstakingly cultivated a sense of excitement about it with their cross-functional partners. As a result, the manager decides to prioritize incremental work on the existing product, without properly investigating a new idea that would have yielded much more impact. In this case, the manager’s decision was driven by thinking about the sunk costs associated with the existing system. This created a barrier to innovation and yielded a bad business decision.

The Confirmation Bias (Nickerson, 1998) describes our tendency to focus upon evidence that confirms our beliefs, while discounting evidence that challenges our beliefs. We’ve certainly fallen prey to this bias in our personal and professional lives. As a hypothetical example: An exec wonders ‘should we implement a loyalty program to improve client retention?’ They find a team member who thinks this sounds like a good idea. So the exec asks the team member to do some market research to inform whether the company should create their own loyalty program. The team member looks for examples of highly successful loyalty programs from other companies. Why look for examples of bad programs? This company has no intention of implementing a bad loyalty program. Also, the team member wants to impress the exec by describing all the opportunities that could be unlocked with this program. They want to demonstrate their abilities as a strategic thinker. They might even get to lead the implementation of the program, which could be great for their career. As a result, the team member builds a presentation that emphasizes positive examples and opportunities, while discounting negative examples and risks. This presentation leads the exec to overestimate the probability that this initiative will improve client retention, and thus fail to filter out a bad business decision.

The biases we’ve listed above are just a sample of the extensive and well-documented set of cognitive biases (e.g., the Availability Bias, Survivorship Bias, and Dunning-Kruger effect) that limit business leaders’ ability to identify and implement only successful business initiatives. Awareness of these biases can decrease our probability of committing them. However, awareness isn’t a silver bullet. We have a desk mat in our office that lists many of these cognitive biases. We regret to report that we often return to our desks, stare down at the mat … and realize that we’ve just fallen prey to another bias.

A final force that is actively working against efforts to discern good ideas from bad is your business maturing. A thought experiment: Suppose a local high school coach told NBA superstar Stephen Curry how to adjust his jump shot. Would implementing these changes improve or hurt his performance? It is hard to imagine it would help. Now, suppose the coach gave this advice to a local 6th grader. It seems likely that it would help the kid’s game.

Now, imagine a consultant telling Google how to improve their search algorithm versus advising a startup on setting up a database. It’s easier to imagine the consultant helping the startup. Why? Well, Google search is a cutting edge system that has received extensive attention from numerous world class experts—kind of like Steph Curry. It’s going to be hard to offer a new great idea. In contrast, the startup will benefit from getting pointed in a variety of good directions—kind of like a 6th grader.

To use a more analytic framework, imagine a hill which represents a company’s objective function5 like profit, revenue, or retention. The company’s goal is to climb to the peak, where its objective is maximized. However, the company can’t see very far in this landscape. It doesn’t know where the peak is. It can only assess (if it’s careful and uses experimentation) whether it’s going uphill or downhill by taking small steps in different directions—perhaps by tweaking its pricing strategy and measuring whether revenue goes up.

When a company (or basketball player) is young, its position on this objective function (profit, etc.) landscape is low. It can step in many directions and go uphill. Through this process, a company can grow (walk up Mount Revenue). However, as it climbs the mountain, a smaller proportion of the possible directions to step will lead uphill. At the summit a step in any direction will take you downhill.
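
A tiny simulation makes the point concrete. The sketch below (Python; the quadratic “hill” and the step size are arbitrary illustrative choices, not a model of any real business) estimates what fraction of random small steps improves the objective when a company is far from the peak versus near it:

```python
import numpy as np

rng = np.random.default_rng(0)

def objective(x):
    # A toy "hill": the objective peaks at x = (0, 0).
    return -np.sum(x ** 2)

def fraction_of_uphill_steps(x, step_size=0.1, trials=10_000):
    # Share of random small steps that improve the objective from position x.
    base = objective(x)
    steps = rng.normal(scale=step_size, size=(trials, x.size))
    return np.mean([objective(x + s) > base for s in steps])

young = np.array([3.0, 3.0])     # low on the hill: roughly half of all steps help
mature = np.array([0.05, 0.05])  # near the summit: very few steps help
print(fraction_of_uphill_steps(young), fraction_of_uphill_steps(mature))
```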

This is admittedly a simple model  of a business (and we already discussed the follies of using simple models). However, all companies will eventually face the truism that as they improve, there are fewer ways to continue to improve (the low apples have been plucked), as well as the extrinsic constraints of market saturation, commoditization, etc. that make it harder to improve your business as it matures.6

So, what to do

We’ve argued that most business ideas fail to deliver on their promised goals. We’ve also explained that there are systematic reasons that make it unlikely that companies will get better, just by trying harder. So where does this leave you? Are you destined to implement mostly bad ideas? Here are a few recommendations that might help:

  1. Run experiments and exercise your optionality. Recognize that your business may be a complex system, making it very difficult to predict how it will respond to your business ideas. Instead of rolling out your new business ideas to all customers, try them on a sample of customers as an experiment. This will show you the impact your idea has on the company. You can then make an informed decision about whether or not to roll out your idea. If your idea has a positive impact, great. Roll it out to all customers. But in the more likely event that your idea does not have the positive impact you were hoping for, you can end the experiment and kill the idea. It may seem wasteful to use company resources to implement a business idea only to later kill it, but this is better than unknowingly providing ongoing support to an idea that is doing nothing or actually hurting your metrics—which is what happens most of the time.
  2. Recognize your cognitive biases, collect a priori predictions, and celebrate learnings. Your company’s ability to filter out bad business ideas will be limited by your team members’ cognitive biases. You can start building a culture that appreciates this fact by sending a survey to all of a project’s stakeholders before your next big release. Ask everyone to predict how the metrics will move. Make an anonymized version of these predictions and their accuracy available to employees. We expect your team members will become less confident in their predictions over time. This process may also reveal that big wins tend to emerge from a string of experiments, rather than a single stroke of inspiration. So celebrate all of the necessary stepping stones on the way to a big win.
  3. Recognize that it’s going to get harder to find successful ideas, so try more things, and get more skeptical. As your company matures, it may get harder to find ways to improve it. We see a few ways to address this challenge. First, try more ideas: since it will be hard to increase the success rate of any individual idea, increase the number of ideas you test. Consider building a leverageable and reusable experimentation platform to increase your bandwidth, and follow the lead of the venture world: fund a lot of ideas to get a few big wins.7 Second, as your company matures, adjust the amount of evidence that is required before you roll out a change—a more mature company should require a higher degree of statistical certainty before inferring that a new feature has improved metrics. In experimental lingo, you might want to tighten the “p-value thresholds” that you use to assess an experiment (see the sketch after this list). Or, to use our metaphor, a 6th grader should probably just listen whenever a coach tells them to adjust their jump shot, but Steph Curry should require a lot of evidence before he adjusts his.
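
To illustrate the third recommendation, here is a rough sketch (using statsmodels; the small effect size and 80% power are arbitrary illustrative choices) of how tightening the significance threshold raises the amount of evidence, in this case sample size, needed before rolling out a change:

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
for alpha in (0.05, 0.01, 0.001):
    # Sample size per group needed to detect a small effect (Cohen's d = 0.05)
    # with 80% power at the given significance threshold.
    n = analysis.solve_power(effect_size=0.05, alpha=alpha, power=0.8)
    print(f"alpha={alpha}: about {round(n):,} customers per experiment arm")
```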

This may be a hard message to accept. It’s easier to assume that all of our ideas are having the positive impact that we intended. It’s more inspiring to believe that successful ideas and companies are the result of brilliance rather than trial and error. But, consider the deference we give to mother nature. She is able to produce such exquisite creatures—the giraffe, the mighty oak tree, even us humans—each so perfectly adapted to their environment that we see them as the rightful owners of their respective niches. Yet, mother nature achieves this not through grandiose ideas, but through trial and error… with a success rate far more dismal than that of our business ideas. It’s an effective strategy if we can convince our egos to embrace it.


References

Arkes, H. R., & Blumer, C. (1985), The psychology of sunk costs. Organizational Behavior and Human Decision Processes, 35, 124-140.

Gneezy, U., & Rustichini, A. (2000). A Fine is a Price. The Journal of Legal Studies, 29(1), 1-17. doi:10.1086/468061

Kahneman, D., & Klein, G. (2009). Conditions for intuitive expertise: A failure to disagree. American Psychologist, 64(6), 515–526. https://doi.org/10.1037/a0016755

Kohavi, R., & Thomke, S. (2017). The surprising power of online experiments. Harvard Business Review, 95(5).

Mauboussin, M. J. (2009). Think Twice: Harnessing the Power of Counterintuition. Harvard Business Review Press.

Milgram, S. (1963). “Behavioral Study of obedience”. The Journal of Abnormal and Social Psychology. 67 (4): 371–378.

Moran, M. (2007). Do It Wrong Quickly: How the Web Changes the Old Marketing Rules. IBM Press.

Nickerson, R. S. (1998), “Confirmation bias: A ubiquitous phenomenon in many guises”, Review of General Psychology, 2 (2): 175–220.

Page, S. E. (2009). Understanding Complexity – The Great Courses – Lecture Transcript and Course Guidebook (1st ed.). The Teaching Company.

Thomke, S. H. (2020). Experimentation Works: The Surprising Power of Business Experiments. Harvard Business Review Press.

Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185(4157), 1124-1131.

Whyte, W. H., (1952). “Groupthink”. Fortune, 114-117, 142, 146.


Footnotes

  1. Do not confuse the term ‘test’ to mean a process by which a nascent idea is vetted to get feedback. In an experiment, the test group receives a full-featured implementation of an idea. The goal of the experiment is to measure impact—not get feedback.
  2. In some cases there may be insufficient sample size, ethical concerns, lack of a suitable control group, and many other conditions that can inhibit experimentation.
  3. Even trained statisticians can fall victim to pressures to cajole the data. “P-hacking”, “significance chasing” and other terms refer to the temptation to use flawed methods in statistical analysis.
  4. We believe that these types of factors are only obvious in hindsight because the signals are often unobserved until we know to look for them (Kahneman & Klein, 2009).
  5. One reason among many why this mental picture is oversimplified is that it implicitly takes business conditions and the world at large to be static—the company “state vector” that maximizes the objective function today is the same as what maximizes the objective function tomorrow. In other words, it ignores that, in reality, the hill is changing shape under our feet as we try to climb it. Still, it’s a useful toy model.
  6. Finding a new market (jumping to a new “hill” in the “Mount Revenue” metaphor), as recommended in the next section, is one way to continue improving business metrics even as your company matures.
  7. VCs are able to learn about the outcomes of their startups even without experimentation. This is because those outcomes are far more readily apparent than those of business ideas. It’s difficult to cajole results to show a successful outcome when the company is out of business.

MLOps and DevOps: Why Data Makes It Different [Radar]

Much has been written about the struggles of deploying machine learning projects to production. As with many burgeoning fields and disciplines, we don’t yet have a shared canonical infrastructure stack or best practices for developing and deploying data-intensive applications. This is frustrating for companies that would prefer making ML an ordinary, fuss-free value-generating function like software engineering, and exciting for vendors who see the opportunity to create buzz around a new category of enterprise software.

The new category is often called MLOps. While there isn’t an authoritative definition for the term, it shares its ethos with its predecessor, the DevOps movement in software engineering: by adopting well-defined processes, modern tooling, and automated workflows, we can streamline the process of moving from development to robust production deployments. This approach has worked well for software development, so it is reasonable to assume that it could address struggles related to deploying machine learning in production too.

However, the concept is quite abstract. Just introducing a new term like MLOps doesn’t solve anything by itself; rather, it just adds to the confusion. In this article, we want to dig deeper into the fundamentals of machine learning as an engineering discipline and outline answers to key questions:

  1. Why does ML need special treatment in the first place? Can’t we just fold it into existing DevOps best practices?
  2. What does a modern technology stack for streamlined ML processes look like?
  3. How can you start applying the stack in practice today?

Why: Data Makes It Different

All ML projects are software projects. If you peek under the hood of an ML-powered application, these days you will often find a repository of Python code. If you ask an engineer to show how they operate the application in production, they will likely show containers and operational dashboards—not unlike any other software service.

Since software engineers manage to build ordinary software without experiencing as much pain as their counterparts in the ML department, the question arises: should we just start treating ML projects as software engineering projects as usual, maybe educating ML practitioners about the existing best practices?

Let’s start by considering the job of a non-ML software engineer: writing traditional software deals with well-defined, narrowly-scoped inputs, which the engineer can exhaustively and cleanly model in the code. In effect, the engineer designs and builds the world wherein the software operates.

In contrast, a defining feature of ML-powered applications is that they are directly exposed to a large amount of messy, real-world data which is too complex to be understood and modeled by hand.

This characteristic makes ML applications fundamentally different from traditional software. It has far-reaching implications as to how such applications should be developed and by whom:

  1. ML applications are directly exposed to the constantly changing real world through data, whereas traditional software operates in a simplified, static, abstract world which is directly constructed by the developer.
  2. ML apps need to be developed through cycles of experimentation: due to the constant exposure to data, we don’t learn the behavior of ML apps through logical reasoning but through empirical observation.
  3. The skillset and the background of people building the applications gets realigned: while it is still effective to express applications in code, the emphasis shifts to data and experimentation—more akin to empirical science—rather than traditional software engineering.

This approach is not novel. There is a decades-long tradition of data-centric programming: developers who have been using data-centric IDEs, such as RStudio, Matlab, Jupyter Notebooks, or even Excel to model complex real-world phenomena, should find this paradigm familiar. However, these tools have been rather insular environments: they are great for prototyping but lacking when it comes to production use.

To make ML applications production-ready from the beginning, developers must adhere to the same set of standards as all other production-grade software. This introduces further requirements:

  1. The scale of operations is often two orders of magnitude larger than in the earlier data-centric environments. Not only is data larger, but models—deep learning models in particular—are much larger than before.
  2. Modern ML applications need to be carefully orchestrated: with the dramatic increase in the complexity of apps, which can require dozens of interconnected steps, developers need better software paradigms, such as first-class DAGs.
  3. We need robust versioning for data, models, code, and preferably even the internal state of applications—think Git on steroids to answer inevitable questions: What changed? Why did something break? Who did what and when? How do two iterations compare?
  4. The applications must be integrated to the surrounding business systems so ideas can be tested and validated in the real world in a controlled manner.

Two important trends collide in these lists. On the one hand, we have the long tradition of data-centric programming; on the other hand, we face the needs of modern, large-scale business applications. Either paradigm is insufficient by itself: it would be ill-advised to suggest building a modern ML application in Excel. Similarly, it would be pointless to pretend that a data-intensive application resembles a run-of-the-mill microservice which can be built with the usual software toolchain consisting of, say, GitHub, Docker, and Kubernetes.

We need a new path that allows the results of data-centric programming (models and data science applications in general) to be deployed to modern production infrastructure, similar to how DevOps practices allow traditional software artifacts to be deployed to production continuously and reliably. Crucially, the new path is analogous but not equal to the existing DevOps path.

What: The Modern Stack of ML Infrastructure

What kind of foundation would the modern ML application require? It should combine the best parts of modern production infrastructure to ensure robust deployments, as well as draw inspiration from data-centric programming to maximize productivity.

While implementation details vary, the major infrastructural layers we’ve seen emerge are relatively uniform across a large number of projects. Let’s now take a tour of the various layers, to begin to map the territory. Along the way, we’ll provide illustrative examples. The intention behind the examples is not to be comprehensive (perhaps a fool’s errand, anyway!), but to reference concrete tooling used today in order to ground what could otherwise be a somewhat abstract exercise.


Adapted from the book Effective Data Science Infrastructure

Foundational Infrastructure Layers

Data

Data is at the core of any ML project, so data infrastructure is a foundational concern. ML use cases rarely dictate the master data management solution, so the ML stack needs to integrate with existing data warehouses. Cloud-based data warehouses, such as Snowflake, AWS’ portfolio of databases like RDS, Redshift or Aurora, or an S3-based data lake, are a great match to ML use cases since they tend to be much more scalable than traditional databases, both in terms of the data set sizes as well as query patterns.

Compute

To make data useful, we must be able to conduct large-scale compute easily. Since the needs of data-intensive applications are diverse, it is useful to have a general-purpose compute layer that can handle different types of tasks from IO-heavy data processing to training large models on GPUs. Besides variety, the number of tasks can be high too: imagine a single workflow that trains a separate model for 200 countries in the world, running a hyperparameter search over 100 parameters for each model—the workflow yields 20,000 parallel tasks.
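
As a minimal local sketch of that kind of fan-out (hypothetical country list and hyperparameter grid, with a placeholder training function), the pattern looks roughly like the following; in practice the same tasks would be dispatched to a cloud batch service rather than a local process pool:

```python
import itertools
from concurrent.futures import ProcessPoolExecutor

COUNTRIES = ["US", "DE", "JP"]        # hypothetical subset of the 200 countries
LEARNING_RATES = [0.01, 0.03, 0.1]    # hypothetical slice of the search space

def train_one(country, lr):
    # Placeholder for real model training; returns a fake validation score.
    return {"country": country, "lr": lr, "score": 0.5}

if __name__ == "__main__":
    tasks = list(itertools.product(COUNTRIES, LEARNING_RATES))
    countries, lrs = zip(*tasks)
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(train_one, countries, lrs))
    print(f"ran {len(results)} training tasks")
```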

Prior to the cloud, setting up and operating a cluster that can handle workloads like this would have been a major technical challenge. Today, a number of cloud-based, auto-scaling systems are easily available, such as AWS Batch. Kubernetes, a popular choice for general-purpose container orchestration, can be configured to work as a scalable batch compute layer, although the downside of its flexibility is increased complexity. Note that container orchestration for the compute layer is not to be confused with the workflow orchestration layer, which we will cover next.

Orchestration

The nature of computation is structured: we must be able to manage the complexity of applications by structuring them, for example, as a graph or a workflow that is orchestrated.

The workflow orchestrator needs to perform a seemingly simple task: given a workflow or DAG definition, execute the tasks defined by the graph in order using the compute layer. There are countless systems that can perform this task for small DAGs on a single server. However, as the workflow orchestrator plays a key role in ensuring that production workflows execute reliably, it makes sense to use a system that is both scalable and highly available, which leaves us with a few battle-hardened options, for instance: Airflow, a popular open-source workflow orchestrator; Argo, a newer orchestrator that runs natively on Kubernetes; and managed solutions such as Google Cloud Composer and AWS Step Functions.
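
As a concrete reference point, a minimal Airflow DAG chaining three steps of a hypothetical training workflow might look like the sketch below (the task names and schedule are invented; a real pipeline would hand the heavy lifting off to the compute layer):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    ...  # pull training data from the warehouse

def train():
    ...  # fit the model, ideally on the compute layer rather than the worker

def evaluate():
    ...  # score the model and publish metrics

with DAG(
    dag_id="model_training",           # hypothetical name
    start_date=datetime(2022, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    train_task = PythonOperator(task_id="train", python_callable=train)
    evaluate_task = PythonOperator(task_id="evaluate", python_callable=evaluate)

    extract_task >> train_task >> evaluate_task
```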

Software Development Layers

While these three foundational layers, data, compute, and orchestration, are technically all we need to execute ML applications at arbitrary scale, building and operating ML applications directly on top of these components would be like hacking software in assembly language: technically possible but inconvenient and unproductive. To make people productive, we need higher levels of abstraction. Enter the software development layers.

Versioning

ML applications and their software artifacts exist and evolve in a dynamic environment. To manage the dynamism, we can resort to taking snapshots that represent immutable points in time: of models, of data, of code, and of internal state. For this reason, we require a strong versioning layer.

While Git, GitHub, and similar tools for software version control work well for code and the usual workflows of software development, they are a bit clunky for tracking all experiments, models, and data. To plug this gap, frameworks like Metaflow or MLflow provide a custom solution for versioning.
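
For instance, a minimal MLflow tracking snippet (the run name, parameters, and metric values below are invented) records a snapshot of one training run so it can be compared against later iterations:

```python
import mlflow

# Record one training run so its parameters, metrics, and artifacts can be
# looked up and compared later.
with mlflow.start_run(run_name="churn-model-v2"):    # hypothetical run name
    mlflow.log_param("learning_rate", 0.01)
    mlflow.log_param("max_depth", 6)
    mlflow.log_metric("validation_auc", 0.87)        # placeholder metric value
    mlflow.log_artifact("model.pkl")                 # assumes this file exists
```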

Software Architecture

Next, we need to consider who builds these applications and how. They are often built by data scientists who are not software engineers or computer science majors by training. Arguably, high-level programming languages like Python are the most expressive and efficient ways that humankind has conceived to formally define complex processes. It is hard to imagine a better way to express non-trivial business logic and convert mathematical concepts into an executable form.

However, not all Python code is equal. Python written in Jupyter notebooks following the tradition of data-centric programming is very different from Python used to implement a scalable web server. To make the data scientists maximally productive, we want to provide supporting software architecture in terms of APIs and libraries that allow them to focus on data, not on the machines.

Data Science Layers

With these five layers, we can present a highly productive, data-centric software interface that enables iterative development of large-scale data-intensive applications. However, none of these layers help with modeling and optimization. We cannot expect data scientists to write modeling frameworks like PyTorch or optimizers like Adam from scratch! Furthermore, there are steps that are needed to go from raw data to features required by models.

Model Operations

When it comes to data science and modeling, we separate three concerns, starting from the most practical progressing towards the most theoretical. Assuming you have a model, how can you use it effectively? Perhaps you want to produce predictions in real-time or as a batch process. No matter what you do, you should monitor the quality of the results. Altogether, we can group these practical concerns in the model operations layer. There are many new tools in this space helping with various aspects of operations, including Seldon for model deployments, Weights and Biases for model monitoring, and TruEra for model explainability.

Feature Engineering

Before you have a model, you have to decide how to feed it with labelled data. Managing the process of converting raw facts to features is a deep topic of its own, potentially involving feature encoders, feature stores, and so on. Producing labels is another, equally deep topic. You want to carefully manage consistency of data between training and predictions, as well as make sure that there’s no leakage of information when models are being trained and tested with historical data. We bucket these questions in the feature engineering layer. There’s an emerging space of ML-focused feature stores such as Tecton or labeling solutions like Scale and Snorkel. Feature stores aim to solve the challenge that many data scientists in an organization require similar data transformations and features for their work and labeling solutions deal with the very real challenges associated with hand labeling datasets.
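
One common way to keep transformations consistent between training and prediction is to bundle the preprocessing with the model itself. A minimal scikit-learn sketch (synthetic data, arbitrary estimator choices):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=1_000, n_features=10, random_state=0)

# Because scaling lives inside the pipeline, the exact same transformation is
# applied at fit time and at predict time, removing one source of
# training/serving skew.
model = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1_000)),
])
model.fit(X, y)
print(model.predict(X[:5]))
```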

Model Development

Finally, at the very top of the stack we get to the question of mathematical modeling: What kind of modeling technique to use? What model architecture is most suitable for the task? How to parameterize the model? Fortunately, excellent off-the-shelf libraries like scikit-learn and PyTorch are available to help with model development.

An Overarching Concern: Correctness and Testing

Regardless of the systems we use at each layer of the stack, we want to guarantee the correctness of results. In traditional software engineering we can do this by writing tests: for instance, a unit test can be used to check the behavior of a function with predetermined inputs. Since we know exactly how the function is implemented, we can convince ourselves through inductive reasoning that the function should work correctly, based on the correctness of a unit test.

This process doesn’t work when the function, such as a model, is opaque to us. We must resort to black box testing—testing the behavior of the function with a wide range of inputs. Even worse, sophisticated ML applications can take a huge number of contextual data points as inputs, like the time of day, the user’s past behavior, or the device type, so an accurate test setup may need to become a full-fledged simulator.

Since building an accurate simulator is a highly non-trivial challenge in itself, often it is easier to use a slice of the real world as a simulator and A/B test the application in production against a known baseline. To make A/B testing possible, all layers of the stack should be able to run many versions of the application concurrently, so an arbitrary number of production-like deployments can be run simultaneously. This poses a challenge to many infrastructure tools of today, which have been designed with more rigid, traditional software in mind. Besides infrastructure, effective A/B testing requires a control plane, a modern experimentation platform, such as StatSig.

How: Wrapping The Stack For Maximum Usability

Imagine choosing a production-grade solution for each layer of the stack: for instance, Snowflake for data, Kubernetes for compute (container orchestration), and Argo for workflow orchestration. While each system does a good job at its own domain, it is not trivial to build a data-intensive application that has cross-cutting concerns touching all the foundational layers. In addition, you have to layer the higher-level concerns from versioning to model development on top of the already complex stack. It is not realistic to ask a data scientist to prototype quickly and deploy to production with confidence using such a contraption. Adding more YAML to cover cracks in the stack is not an adequate solution.

Many data-centric environments of the previous generation, such as Excel and RStudio, really shine at maximizing usability and developer productivity. Optimally, we could wrap the production-grade infrastructure stack inside a developer-oriented user interface. Such an interface should allow the data scientist to focus on concerns that are most relevant for them, namely the topmost layers of stack, while abstracting away the foundational layers.

The combination of a production-grade core and a user-friendly shell makes sure that ML applications can be prototyped rapidly, deployed to production, and brought back to the prototyping environment for continuous improvement. The iteration cycles should be measured in hours or days, not in months.

Over the past five years, a number of such frameworks have started to emerge, both as commercial offerings as well as in open-source.

Metaflow is an open-source framework, originally developed at Netflix, specifically designed to address this concern (disclaimer: one of the authors works on Metaflow): How can we wrap robust production infrastructure in a single coherent, easy-to-use interface for data scientists? Under the hood, Metaflow integrates with best-of-breed production infrastructure, such as Kubernetes and AWS Step Functions, while providing a development experience that draws inspiration from data-centric programming, that is, by treating local prototyping as a first-class citizen.
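
As a rough sketch of what that interface looks like (a toy flow with an invented country list and a placeholder training step), a Metaflow workflow fans out over a list and joins the results; the same script can be run locally during prototyping or scheduled on production infrastructure:

```python
from metaflow import FlowSpec, step

class TrainingFlow(FlowSpec):

    @step
    def start(self):
        self.countries = ["US", "DE", "JP"]   # hypothetical
        self.next(self.train, foreach="countries")

    @step
    def train(self):
        self.country = self.input
        self.score = 0.9                      # placeholder for real training
        self.next(self.join)

    @step
    def join(self, inputs):
        self.results = {i.country: i.score for i in inputs}
        self.next(self.end)

    @step
    def end(self):
        print(self.results)

if __name__ == "__main__":
    TrainingFlow()
```

Running the script with the "run" command executes the flow locally, and Metaflow persists each step's artifacts so past runs can be inspected later.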

Google’s open-source Kubeflow addresses similar concerns, although with a more engineer-oriented approach. As a commercial product, Databricks provides a managed environment that combines data-centric notebooks with a proprietary production infrastructure. All cloud providers provide commercial solutions as well, such as AWS Sagemaker or Azure ML Studio.

While these solutions, and many less known ones, seem similar on the surface, there are many differences between them. When evaluating solutions, consider focusing on the three key dimensions covered in this article:

  1. Does the solution provide a delightful user experience for data scientists and ML engineers? There is no fundamental reason why data scientists should accept a worse level of productivity than is achievable with existing data-centric tools.
  2. Does the solution provide first-class support for rapid iterative development and frictionless A/B testing? It should be easy to take projects quickly from prototype to production and back, so production issues can be reproduced and debugged locally.
  3. Does the solution integrate with your existing infrastructure, in particular to the foundational data, compute, and orchestration layers? It is not productive to operate ML as an island. When it comes to operating ML in production, it is beneficial to be able to leverage existing production tooling for observability and deployments, for example, as much as possible.

It is safe to say that all existing solutions still have room for improvement. Yet it seems inevitable that over the next five years the whole stack will mature, and the user experience will converge towards, and eventually surpass, the best data-centric IDEs. Businesses will learn how to create value with ML much as they do with traditional software engineering, and empirical, data-driven development will take its place amongst other ubiquitous software development paradigms.

The Quality of Auto-Generated Code [Radar]

Kevlin Henney and I were riffing on some ideas about GitHub Copilot, the tool for automatically generating code based on GPT-3’s language model, trained on the body of code that’s in GitHub. This article poses some questions and (perhaps) some answers, without trying to present any conclusions.

First, we wondered about code quality. There are lots of ways to solve a given programming problem; but most of us have some ideas about what makes code “good” or “bad.” Is it readable, is it well-organized? Things like that.  In a professional setting, where software needs to be maintained and modified over long periods, readability and organization count for a lot.

We know how to test whether or not code is correct (at least up to a certain limit). Given enough unit tests and acceptance tests, we can imagine a system for automatically generating code that is correct. Property-based testing might give us some additional ideas about building test suites robust enough to verify that code works properly. But we don’t have methods to test for code that’s “good.” Imagine asking Copilot to write a function that sorts a list. There are lots of ways to sort. Some are pretty good—for example, quicksort. Some of them are awful. But a unit test has no way of telling whether a function is implemented using quicksort, permutation sort (which completes in factorial time), sleep sort, or one of the other strange sorting algorithms that Kevlin has been writing about.
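
A property-based test can at least pin down functional correctness across many inputs, even though it says nothing about which algorithm was chosen or how readable the code is. A minimal sketch using the hypothesis library (generated_sort is a hypothetical stand-in for whatever Copilot produced):

```python
from collections import Counter

from hypothesis import given, strategies as st

def generated_sort(xs):
    # Stand-in for an auto-generated sorting function under test.
    return sorted(xs)

@given(st.lists(st.integers()))
def test_sort_properties(xs):
    out = generated_sort(xs)
    # Property 1: the output is ordered.
    assert all(a <= b for a, b in zip(out, out[1:]))
    # Property 2: the output is a permutation of the input.
    assert Counter(out) == Counter(xs)
```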

Do we care? Well, we care about O(N log N) behavior versus O(N!). But assuming that we have some way to resolve that issue, if we can specify a program’s behavior precisely enough so that we are highly confident that Copilot will write code that’s correct and tolerably performant, do we care about its aesthetics? Do we care whether it’s readable? 40 years ago, we might have cared about the assembly language code generated by a compiler. But today, we don’t, except for a few increasingly rare corner cases that usually involve device drivers or embedded systems. If I write something in C and compile it with gcc, realistically I’m never going to look at the compiler’s output. I don’t need to understand it.

To get to this point, we may need a meta-language for describing what we want the program to do that’s almost as detailed as a modern high-level language. That could be what the future holds: an understanding of “prompt engineering” that lets us tell an AI system precisely what we want a program to do, rather than how to do it. Testing would become much more important, as would understanding precisely the business problem that needs to be solved. “Slinging code” in whatever the language would become less common.

But what if we don’t get to the point where we trust automatically generated code as much as we now trust the output of a compiler? Readability will be at a premium as long as humans need to read code. If we have to read the output from one of Copilot’s descendants to judge whether or not it will work, or if we have to debug that output because it mostly works, but fails in some cases, then we will need it to generate code that’s readable. Not that humans currently do a good job of writing readable code; but we all know how painful it is to debug code that isn’t readable, and we all have some concept of what “readability” means.

Second: Copilot was trained on the body of code in GitHub. At this point, it is all (or almost all) written by humans. Some of it is good, high quality, readable code; a lot of it isn’t. What if Copilot became so successful that Copilot-generated code came to constitute a significant percentage of the code on GitHub? The model will certainly need to be re-trained from time to time. So now, we have a feedback loop: Copilot trained on code that has been (at least partially) generated by Copilot. Does code quality improve? Or does it degrade? And again, do we care, and why?

This question can be argued either way. People working on automated tagging for AI seem to be taking the position that iterative tagging leads to better results: i.e., after a tagging pass, use a human-in-the-loop to check some of the tags, correct them where wrong, and then use this additional input in another training pass. Repeat as needed. That’s not all that different from current (non-automated) programming: write, compile, run, debug, as often as needed to get something that works. The feedback loop enables you to write good code.

A human-in-the-loop approach to training an AI code generator is one possible way of getting “good code” (for whatever “good” means)—though it’s only a partial solution. Issues like indentation style, meaningful variable names, and the like are only a start. Evaluating whether a body of code is structured into coherent modules, has well-designed APIs, and could easily be understood by maintainers is a more difficult problem. Humans can evaluate code with these qualities in mind, but it takes time. A human-in-the-loop might help to train AI systems to design good APIs, but at some point, the “human” part of the loop will start to dominate the rest.

If you look at this problem from the standpoint of evolution, you see something different. If you breed plants or animals (a highly selected form of evolution) for one desired quality, you will almost certainly see all the other qualities degrade: you’ll get large dogs with hips that don’t work, or dogs with flat faces that can’t breathe properly.

What direction will automatically generated code take? We don’t know. Our guess is that, without ways to measure “code quality” rigorously, code quality will probably degrade. Ever since Peter Drucker, management consultants have liked to say, “If you can’t measure it, you can’t improve it.” And we suspect that applies to code generation, too: aspects of the code that can be measured will improve, aspects that can’t won’t.  Or, as the accounting historian H. Thomas Johnson said, “Perhaps what you measure is what you get. More likely, what you measure is all you’ll get. What you don’t (or can’t) measure is lost.”

We can write tools to measure some superficial aspects of code quality, like obeying stylistic conventions. We already have tools that can “fix” fairly superficial quality problems like indentation. But again, that superficial approach doesn’t touch the more difficult parts of the problem. If we had an algorithm that could score readability, and restrict Copilot’s training set to code that scores in the 90th percentile, we would certainly see output that looks better than most human code. Even with such an algorithm, though, it’s still unclear whether that algorithm could determine whether variables and functions had appropriate names, let alone whether a large project was well-structured.

And a third time: do we care? If we have a rigorous way to express what we want a program to do, we may never need to look at the underlying C or C++. At some point, one of Copilot’s descendants may not need to generate code in a “high level language” at all: perhaps it will generate machine code for your target machine directly. And perhaps that target machine will be Web Assembly, the JVM, or something else that’s very highly portable.

Do we care whether tools like Copilot write good code? We will, until we don’t. Readability will be important as long as humans have a part to play in the debugging loop. The important question probably isn’t “do we care”; it’s “when will we stop caring?” When we can trust the output of a code model, we’ll see a rapid phase change.  We’ll care less about the code, and more about describing the task (and appropriate tests for that task) correctly.

Radar trends to watch: October 2021 [Radar]

The unwilling star of this month’s trends is clearly Facebook. Between reports that they knew about the damage that their applications were causing long before that damage hit the news, their continued denials and apologies, and their attempts to block researchers from studying the consequences of their products, they’ve been in the news almost every day. Perhaps the most interesting item, though, is the introduction of Ray-Ban Stories, a pair of sunglasses with a camera built in. We’ve been talking about virtual and augmented reality for years; when will it enter the mainstream? Will Stories be enough to make it cool, or will it have the same fate as Google Glass?

AI

  • Researchers at Samsung and Harvard are proposing to copy the neuronal interconnections of parts of the brain, and “paste” them onto a semiconductor array, creating an integrated circuit that directly models the brain’s interconnections.
  • Using AI to understand “lost” languages, written languages that we don’t know how to translate, isn’t just about NLP; it sometimes requires deciphering damaged texts (such as eroded stone tablets) where humans can no longer recognize the written characters.
  • Inaccurate face recognition is preventing people from getting necessary government aid, and there are few (if any) systems for remediation.
  • DeepMind has been studying techniques for making the output of language generation models like GPT-3 less toxic, and found that there are no good solutions.
  • Apple is working on iPhone features to detect depression, cognitive decline, and autism.  A phone that plays psychiatrist is almost certainly a bad idea. How intrusive do you want your phone to be?
  • Reservoir computing is a neural network technique that has been used to solve computationally difficult problems in dynamic systems. It is very resource intensive, but recent work has led to speedups by factors of up to a million. It may be the next step forward in AI.
  • Can AI be used to forecast (and even plan) the future of scientific research?  Not yet, but one group is working on analyzing the past 10 years of research for NASA’s Decadal Survey.
  • There have been many articles about using AI to read X-Rays. This one covers an experiment that uses training data from multiple sources to reduce one of the problems plaguing this technology: different X-ray machines, different calibration, different staff. It also places a human radiologist in the loop; the AI is only used to detect areas of possible abnormality.
  • It isn’t a surprise, but undergraduates who are studying data science receive little training in ethics, including issues like privacy and systemic bias.
  • Stanford’s Institute for Human-Centered Artificial Intelligence is creating a group to study the impact of “foundational” models on issues like bias and fairness. Foundational models are very large models like GPT-3 on which other models are built. Problems with foundational models are easily inherited by models on top of them.
  • Can machine learning learn to unlearn?  That may be required by laws like GDPR and the European “right to be forgotten.” Can a model be trained to eliminate the influence of some of its training data, without being retrained from the beginning?
  • DeepMind’s technology for up-scaling image resolution looks really good. It produces excellent high-resolution images from pixelated originals, works on natural scenes as well as portraits, and they appear to have used a good number of Black people as models.
  • Amazon has announced details about Astro, its home robot. But questions remain: is this a toy? A data collection ploy? I don’t know that we need something that follows you around playing podcasts. It integrates with Amazon products like Ring and Alexa Guard.

Security

  • Is self-healing cybersecurity possible by killing affected containers and starting new ones? That’s an interesting partial solution to cloud security, though it only comes into play after an attack has succeeded.
  • With three months to go in 2021, we’ve already seen a record number of zero-day exploits. Is this a crisis? Or is it good news, because bad actors are discovered more effectively? One thing is clear: discovering new 0days is becoming more difficult, making them more valuable.
  • The FBI had the decryption key for the Kaseya ransomware attack, but delayed sharing it with victims for three weeks. The FBI claims it withheld the key because it was planning a counterattack against the REvil group, which disappeared before the attack was executed.
  • Privacy for the masses? iOS 15 has a beta “private relay” feature that appears to be something like TOR. And Nahoft, an application for use in Iran, encodes private messages as sequences of innocuous words that can get by automated censors.
  • HIPv2 is an alternative to TLS that is designed for implementing zero-trust security for embedded devices.
  • Kubescape is an open source tool to test whether Kubernetes has been deployed securely.  The tests are based on the NSA’s guidance for hardening Kubernetes.
  • Rootkits are hardly new, but now they’re being used to attack containers. Their goal is usually to mine bitcoin, and to hide that mining from monitoring tools. Tracee is a new tool, built with eBPF, that may help to detect successful attacks.

User Interfaces

  • Kids these days don’t understand files and directories. Seriously, Google’s dominance in everyday life means that users expect to find things through search. But search is often inadequate. It will be important for software designers to think through these issues.
  • Holograms you can touch? Aerohaptics uses jets of air to create the feeling of “touch” when interacting with a hologram. Another step towards the Star Trek Holodeck.
  • Fraunhofer has developed a system for detecting whether a driver is tired or asleep.  Software like this will be particularly important for semi-automated driving systems, which require support from a human driver.

Programming

  • What is property-based testing, anyway? Fuzzing? Unit tests at scale? Greater testing discipline will be required if we expect AI systems to generate code. Can property-based testing get there? (See the short example after this list.)
  • Google Cloud has introduced Supply Chain Twin, a “digital twin” service for supply chain management.
  • Open VSCodeServer is an open source project that allows VSCode to run on a remote machine and be accessed through a web browser.
  • Ent is an open source object-relational mapping tool for Go that uses graph concepts to model the database schema. Facebook has contributed Ent to the CNCF.
  • Glean is an open source search engine for source code.  Looks like it’s a LOT better than grepping through your src directories.
  • Urbit looks like it could be an interesting operating system for decentralized peer-to-peer applications.
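
If you haven’t seen property-based testing in action (first item in this list), here is a minimal sketch using Python’s hypothesis library: rather than hand-picking inputs, you state properties that must hold for every input and let the framework generate, and shrink, counterexamples. The function under test is just a stand-in.

    # Minimal property-based test with the hypothesis library (pip install hypothesis).
    from collections import Counter
    from hypothesis import given, strategies as st

    def my_sort(xs):
        return sorted(xs)    # stand-in for the code you actually want to test

    @given(st.lists(st.integers()))
    def test_sort_properties(xs):
        out = my_sort(xs)
        # Property 1: the output is ordered.
        assert all(a <= b for a, b in zip(out, out[1:]))
        # Property 2: the output is a permutation of the input.
        assert Counter(out) == Counter(xs)

    if __name__ == "__main__":
        test_sort_properties()    # hypothesis runs this against many generated lists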

Law

  • Facebook on regulation: Please require competitors to do the things we do. And don’t look at targeted advertising.
  • NFTs, generative art, and open source: do we need a new kind of license to protect artistic works that are generated by software?
  • China issues a Request for Comments on their proposed social media regulations. Google Translate’s translation isn’t bad, and CNBC has a good summary. Users must be notified about the use of algorithmic recommendations; users must be able to disable recommendations; and algorithmic recommendations must not be designed to create addictive behavior.
  • South Korea has passed a law that will force Apple and Google to open their devices to other app stores.
  • Research by Google shows that, worldwide, Government-ordered Internet shutdowns have become much more common in the past year. These shutdowns are usually to suppress dissent. India has shut down Internet access more than any other country.

Biology

  • George Church’s startup Colossal has received venture funding for developing “cold tolerant Asian elephants” (as Church puts it), a project more commonly known as de-extincting woolly mammoths.
  • Researchers at NYU have created artificial cell-like objects that can ingest, process, and expel objects. These aren’t artificial cells, but represent a step towards creating them.

Hardware

  • A breakthrough in building phase change memory that consumes little power may make phase change memory practical, allowing tighter integration between processors and storage.
  • Mainframes aren’t dead. The Telum is IBM’s new processor for its System Z mainframes. 7nm technology, 5 GHz base clock speed, 8 cores with 16 threads; it’s a very impressive chip.
  • One of Google’s X companies has deployed a 20 Gbps Internet trunk using lasers. The connection crosses the Congo River, a path that is difficult because of the river’s depth and speed.  This technology could be used in other places where running fiber is difficult.
  • Facebook and Ray-Ban have released smart glasses (branded as Ray-Ban Stories), which are eyeglasses with a built-in camera and speakers. This is not AR (there is no projector), but a step on the way. Xiaomi also appears to be working on smart glasses, and Linux is getting into the act with a work-oriented headset called Simula One.

Quantum Computing

  • IBM introduces Qiskit Nature, a platform for using quantum computers to experiment with quantum effects in the natural sciences. Because these experiments are about the behavior of quantum systems, they (probably) don’t require the error correction that’s necessary to make quantum computing viable.
  • Want to build your own quantum computer?  IBM has open sourced Qiskit Metal, a design automation tool for superconducting quantum computers.
  • Curiously-named Valleytronics uses electrons’ “valley pseudospin” to store quantum data. It might enable small, room-temperature quantum computers.

Social Media

  • Facebook has put “Instagram for Kids” on hold. While they dispute the evidence that Instagram harms teenagers, public outcry and legislative pressure, along with Facebook’s own evidence that Instagram is particularly damaging to teenage girls, have caused them to delay the release.
  • Twitter is allowing bot accounts to identify themselves as bots.  Labeling isn’t mandatory.
  • Facebook adds junk content to its HTML to prevent researchers from using automated tools to collect posts.

Ethical Social Media: Oxymoron or Attainable Goal? [Radar]

Humans have wrestled with ethics for millennia. Each generation spawns a fresh batch of ethical dilemmas and then wonders how to deal with them.

For this generation, social media has generated a vast set of new ethical challenges, which is unsurprising when you consider the degree of its influence. Social media has been linked to health risks in individuals and political violence in societies. Despite growing awareness of its potential for causing harm, social media has received what amounts to a free pass on unethical behavior.

Minerva Tantoco, who served as New York City’s first chief technology officer, suggests that “technology exceptionalism” is the root cause. Unlike the rapacious robber barons of the Gilded Age, today’s tech moguls were viewed initially as eccentric geeks who enjoyed inventing cool new products. Social media was perceived as a harmless timewaster, rather than as a carefully designed tool for relentless commerce and psychological manipulation.

“The idea of treating social media differently came about because the individuals who started it weren’t from traditional media companies,” Tantoco says. “Over time, however, the distinction between social media and traditional media has blurred, and perhaps the time has come for social media to be subject to the same rules and codes that apply to broadcasters, news outlets and advertisers. Which means that social media would be held accountable for content that causes harm or violates existing laws.”

Ethical standards that were developed for print, radio, television, and telecommunications during the 20th century could be applied to social media. “We would start with existing norms and codes for media generally and test whether these existing frameworks and laws would apply to social media,” Tantoco says.

Taking existing norms and applying them, with modifications, to novel situations is a time-honored practice.  “When e-commerce web sites first started, it was unclear if state sales taxes would apply to purchases,” Tantoco says. “It turned out that online sales were not exempt from sales taxes and that rules that had been developed for mail-order sites decades earlier could be fairly applied to e-commerce.”

Learning from AI

Christine Chambers Goodman, a professor at Pepperdine University’s Caruso School of Law, has written extensively on the topic of artificial intelligence and its impact on society. She sees potential in applying AI guidelines to social media, and she cited the European Commission’s High-Level Expert Group on Artificial Intelligence’s seven key ethical requirements for trustworthy AI:1

  • Human agency and oversight
  • Technical robustness and safety
  • Privacy and data governance
  • Transparency
  • Diversity, non-discrimination and fairness
  • Societal and environmental well-being
  • Accountability

The commission’s proposed requirements for AI would be a good starting point for conversations about ethical social media. Ideally, basic ethical components would be designed into social media platforms before they are built. Software engineers should be trained to recognize their own biases and learn specific techniques for writing code that is inherently fair and non-discriminatory.

“It starts with that first requirement of human agency and oversight,” Goodman says. If ethical standards are “paramount” during the design phase of a platform, “then I see some room for optimism.”

Colleges and universities also can play important roles in training a new generation of ethical software engineers by requiring students to take classes in ethics, she says.

Economic Fairness and Equity

Social media companies are private business entities, even when they are publicly held. But the social media phenomenon has become so thoroughly woven into the fabric of our daily lives that many people now regard it as a public utility such as gas, electricity, and water. In a remarkably brief span of time, social media has become an institution, and generally speaking, we expect our institutions to behave fairly and equitably.  Clearly, however, the social media giants see no reason to share the economic benefits of their success with anyone except their shareholders.

“The large social media companies make hundreds of billions of dollars from advertising revenue and share almost none of it with their users,” says Greg Fell, CEO of Display Social, a platform that shares up to 50 percent of its advertising revenue with content creators who post on its site.

Historically, content creators have been paid for their work. Imagine if CBS had told Lucille Ball and Desi Arnaz that they wouldn’t be paid for creating episodes of “I Love Lucy,” but that instead they would be allowed to sell “I Love Lucy” coffee mugs and T-shirts. If the original TV networks had operated like social media corporations, there never would have been a Golden Age of Television.

Most societies reward creators, artists, entertainers, athletes, and influencers for their contributions. Why does social media get to play by a different set of rules?

“Economic fairness should be part of the social media ethos. People should be rewarded financially for posting on social media, instead of being exploited by business models that are unfair and unethical,” Fell says.

From Fell’s perspective, the exploitive and unfair economic practices of the large social media companies represent short-term thinking. “Ultimately, they will burn out their audiences and implode. Meantime, they are causing harm. That’s the problem with unethical behavior—in the long run, it’s self-destructive and self-defeating.”

Transforming Attention into Revenue

Virtually all of the large social media platforms rely on some form of advertising to generate revenue. Their business models are exceedingly simple: they attract the attention of users and then sell the attention to advertisers. In crude terms, they’re selling your eyeballs to the highest bidder.

As a result, their only real interest is attracting attention. The more attention they attract, the more money they make. Their algorithms are brilliantly designed to catch and hold your attention by serving up content that will trigger dopamine rushes in your brain. Dopamine isn’t a cause of addiction, but it plays a role in addictive behaviors. So, is it fair to say that social media is intentionally addictive? Maybe.

“For many social media companies, addictive behavior (as in people consuming more than they intend to and regretting it afterwards) is the point,” says Esther Dyson, an author, philanthropist, and investor focused on health, open government, digital technology, biotechnology, and aerospace. “Cigarettes, drugs, and gambling are all premised on the model that too much is never enough.  And from the point of view of many investors, sustainable profits are not enough.  They want exits. Indeed, the goal of these investors is creating ever-growing legions of addicts. That starts with generating and keeping attention.”

Monetizing Misinformation

As it happens, misinformation is highly attractive to many users. It’s a digital version of potato chips—you can’t eat just one. The algorithms figure this out quickly, and feed users a steady supply of misinformation to hold their attention.

In an advertising-driven business model, attention equals dollars. With the help of machine learning and sophisticated algorithms, social media has effectively monetized misinformation, creating a vicious, addictive cycle that seems increasingly difficult to stop.

Social media has staked its fortunes to a business model that is deeply unethical and seems destined to fail in the long term. But could the industry survive, at least in the short term, with a business model that hews more closely to ethical norms?

Greg Fell doesn’t believe that ethical guidelines will slow the industry’s growth or reduce its profitability. “People expect fairness. They want to be treated as human beings, not as products,” he says. “You can build fairness into a platform if you make it part of your goal from the start. But it shouldn’t be an afterthought.”

Slowing the Spread of False Narratives

In addition to implementing structural design elements that would make it easier for people to recognize misinformation and false narratives, social media companies could partner with the public sector to promote media literacy.  Renée DiResta is the technical research manager at Stanford Internet Observatory, a cross-disciplinary program of research, teaching, and policy engagement for the study of abuse in current information technologies. She investigates the spread of narratives across social and traditional media networks.

“I think we need better ways for teaching people to distinguish between rhetoric and reality,” DiResta says, noting that tropes such as “dead people are voting” are commonly repeated and reused from one election cycle to the next, even when they are provably false. These kinds of tropes are the “building blocks” of misinformation campaigns designed to undermine confidence in elections, she says.

“If we can help people recognize the elements of false narratives, maybe they will build up an immunity to them,” DiResta says.

It’s Not Too Late to Stop the Train

The phenomenon we recognize today as “social media” only began taking shape in the late 1990s and early 2000s. It is barely two decades old, which makes it far too young to have developed iron-clad traditions. It is an immature field by any measure, and it’s not too late to alter its course.

Moreover, social media’s business model is not terribly complicated, and it’s easy to envision a variety of other models that might be equally or even more profitable, and represent far less of a threat to society. Newer platforms such as Substack, Patreon, OnlyFans, Buy Me a Coffee, and Display Social are opening the door to a creator-centric social media industry that isn’t fueled primarily by advertising dollars.

“Social media has its positives, and it isn’t all doom and gloom, but it certainly isn’t perfect and resolving some of these issues could ensure these applications are the fun and happy escape they need to be,” says Ella Chambers, UX designer and creator of the UK-based Ethical Social Media Project. “The majority of social media is okay.”

That said, some of the problems created by social media are far from trivial. “My research led me to conclude that the rise of social media has brought the downfall of many users’ mental health,” Chambers says. A recent series of investigative articles in the Wall Street Journal casts a harsh spotlight on the mental health risks of social media, especially to teenage girls. Facebook has issued a rebuttal3 to the WSJ, but it’s unlikely to persuade critics that social media is some kind of wonderful playground for kids and teens.

Creating a practical framework of ethical guidelines would be a positive step forward. Ideally, the framework would evolve into a set of common practices and processes for ensuring fairness, diversity, inclusion, equity, safety, accuracy, accountability, and transparency in social media.

Chinese officials recently unveiled a comprehensive draft of proposed rules governing the use of recommendation algorithms in China.2 One of the proposed regulations would require algorithm providers to “respect social ethics and ethics, abide by business ethics and professional ethics, and follow the principles of fairness, openness, transparency, scientific rationality, and honesty.”

Another proposed regulation would provide users with “convenient options to turn off algorithm recommendation services” and enable users to select, modify or delete user tags. And another proposed rule would restrict service providers from using algorithms “to falsely register accounts … manipulate user accounts, or falsely like, comment, forward, or navigate through web pages to implement traffic fraud or traffic hijacking …”

Eloy Sasot, group chief data and analytics officer at Richemont, the Switzerland-based luxury goods holding company, agrees that regulations are necessary. “And the regulations also should be managed with extreme care. When you add rules to an already complex system, there can be unintended consequences, both at the AI-solution level and the macro-economic level,” he says.

For instance, small companies, which have limited resources, may be less able to counter negative business impacts created by regulations targeting large companies. “So, in effect, regulations, if not carefully supervised, might result in a landscape that is less competitive and more monopolistic, with unintended consequences for end consumers whom the regulations were designed to protect,” he explains.

Technology Problem, or a People Problem?

Casey Fiesler is an assistant professor in the Department of Information Science at University of Colorado Boulder. She researches and teaches in the areas of technology ethics, internet law and policy, and online communities.

“I do not think that social media—or more broadly, online communities—are inherently harmful,” says Fiesler. “In fact, online communities have also done incredible good, especially in terms of social support and activism.”

But the harm caused by unfettered use of social media “often impacts marginalized and vulnerable users disproportionately,” she notes. Ethical social media platforms would consider those effects and work proactively to reduce or eliminate hate speech, trolling, defamation, cyber bullying, swatting, doxing, impersonation, and the intentional spread of false narratives.

“I consider myself an optimist who thinks that it is very important to think like a pessimist. And we should critique technology like social media because it has so much potential for good, and if we want to see those benefits, then we need to push for it to be better,” Fiesler says.

Ultimately, the future of ethical social media may depend more on the behaviors of people than on advances in technology.

“It’s not the medium that’s unethical—it’s the business people controlling it,” Dyson observes. “Talking about social media ethics is like talking about telephone ethics. It really depends on the people involved, not the platform.”

From Dyson’s point of view, the quest for ethical social media represents a fundamental challenge for society. “Are parents teaching their children to behave ethically? Are parents serving as role models for ethical behavior? We talk a lot about training AI, but are we training our children to think long-term, or just to seek short-term relief? Addiction is not about pleasure; it’s about relief from discomfort, from anxiety, from uncertainty, from a sense that we have no future,” she adds. “I personally think we’re just being blind to the consequences of short-term thinking. Silicon Valley is addicted to profits and exponential growth. But we need to start thinking about what we’re creating for the long term.”


Footnotes

  1. https://digital-strategy.ec.europa.eu/en/library/ethics-guidelines-trustworthy-ai
  2. http://www.cac.gov.cn/2021-08/27/c_1631652502874117.htm
  3. https://about.fb.com/news/2021/09/research-teen-well-being-and-instagram/

2021 Data/AI Salary Survey [Radar]

In June 2021, we asked the recipients of our Data & AI Newsletter to respond to a survey about compensation. The results gave us insight into what our subscribers are paid, where they’re located, what industries they work for, what their concerns are, and what sorts of career development opportunities they’re pursuing.

While it’s sadly premature to say that the survey took place at the end of the COVID-19 pandemic (though we can all hope), it took place at a time when restrictions were loosening: we were starting to go out in public, have parties, and in some cases even attend in-person conferences. The results then provide a place to start thinking about what effect the pandemic had on employment. There was a lot of uncertainty about stability, particularly at smaller companies: Would the company’s business model continue to be effective? Would your job still be there in a year? At the same time, employees were reluctant to look for new jobs, especially if they would require relocating—at least according to the rumor mill. Were those concerns reflected in new patterns for employment?

Executive Summary

  • The average salary for data and AI professionals who responded to the survey was $146,000.
  • The average change in compensation over the last three years was $9,252. This corresponds to an annual increase of 2.25%. However, 8% of the respondents reported decreased compensation, and 18% reported no change.
  • We don’t see evidence of a “great resignation.” 22% of respondents said they intended to change jobs, roughly what we would have expected. Respondents seemed concerned about job security, probably because of the pandemic’s effect on the economy.
  • Average compensation was highest in California ($176,000), followed by Eastern Seaboard states like New York and Massachusetts.
  • Compensation for women was significantly lower than for men: on average, women earned 84% of what men did. Salaries were lower regardless of education or job title. Women were more likely than men to have advanced degrees, particularly PhDs.
  • Many respondents acquired certifications. Cloud certifications, specifically in AWS and Microsoft Azure, were most strongly associated with salary increases.
  • Most respondents participated in training of some form. Learning new skills and improving old ones were the most common reasons for training, though hireability and job security were also factors. Company-provided training opportunities were most strongly associated with pay increases.

Demographics

The survey was publicized through O’Reilly’s Data & AI Newsletter and was limited to respondents in the United States and the United Kingdom. There were 3,136 valid responses, 2,778 from the US and 284 from the UK. This report focuses on the respondents from the US, with only limited attention paid to those from the UK. A small number of respondents (74) identified as residents of the US or UK, but their IP addresses indicated that they were located elsewhere. We didn’t use the data from these respondents; in practice, discarding this data had no effect on the results.

Of the 2,778 US respondents, 2,225 (81%) identified as men, and 383 (14%) identified as women (as identified by their preferred pronouns). 113 (4%) identified as “other,” and 14 (0.5%) used “they.”

The results are biased by the survey’s recipients (subscribers to O’Reilly’s Data & AI Newsletter). Our audience is particularly strong in the software (20% of respondents), computer hardware (4%), and computer security (2%) industries—over 25% of the total. Our audience is also strong in the states where these industries are concentrated: 42% of the US respondents lived in California (20%), New York (9%), Massachusetts (6%), and Texas (7%), though these states only make up 27% of the US population.

Compensation Basics

The average annual salary for employees who worked in data or AI was $146,000. Most salaries were between $100,000 and $150,000 yearly (34%); the next most common salary tier was from $150,000 to $200,000 (26%). Compensation depended strongly on location, with average salaries highest in California ($176,000).

The average salary change over the past three years was $9,252, which is 2.25% per year (assuming a final salary equal to the average). A small number of respondents (8%) reported salary decreases, and 18% reported no change. Economic uncertainty caused by the pandemic may be responsible for the declines in compensation. 19% reported increases of $5,000 to $10,000 over that period; 14% reported increases of over $25,000. A study by the IEEE suggests that the average salary for technical employees increased 3.6% per year, higher than our respondents indicated.
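
For the curious, here is our back-of-the-envelope reconstruction of that 2.25% figure. The method (spreading the three-year change evenly and comparing it to the implied starting salary) is our assumption about the arithmetic, not something taken from the survey data itself.

    # Assumes: three-year change spread evenly, compared to the implied starting
    # salary (the $146,000 average minus the $9,252 change).
    final_salary = 146_000
    three_year_change = 9_252

    starting_salary = final_salary - three_year_change   # ~$136,748
    annual_change = three_year_change / 3                # ~$3,084 per year
    annual_pct = 100 * annual_change / starting_salary
    print(f"{annual_pct:.2f}% per year")   # prints 2.26%, matching the reported
                                           # 2.25% up to rounding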

39% of respondents reported promotions in the past three years, and 37% reported changing employers during that period. 22% reported that they were considering changing jobs because their salaries hadn’t increased during the past year. Is this a sign of what some have called a “great resignation”? Common wisdom has it that technical employees change jobs every three to four years. LinkedIn and Indeed both recommend staying for at least three years, though they observe that younger employees change jobs more often. LinkedIn elsewhere states that the annual turnover rate for technology employees is 13.2%—which suggests that employees stay at their jobs for roughly seven and a half years. If that’s correct, the 37% that changed jobs over three years seems about right, and the 22% who said they “intend to leave their job due to a lack of compensation increase” doesn’t seem overly high. Keep in mind that intent to change and actual change are not the same—and that there are many reasons to change jobs aside from salary, including flexibility around working hours and working from home.

64% of the respondents took part in training or obtained certifications in the past year, and 31% reported spending over 100 hours in training programs, ranging from formal graduate degrees to reading blog posts. As we’ll see later, cloud certifications (specifically in AWS and Microsoft Azure) were the most popular and appeared to have the largest effect on salaries.

The reasons respondents gave for participating in training were surprisingly consistent. The vast majority reported that they wanted to learn new skills (91%) or improve existing skills (84%). Data and AI professionals are clearly interested in learning—and that learning is self-motivated, not imposed by management. Relatively few (22%) said that training was required by their job, and even fewer participated in training because they were concerned about losing their job (9%).

However, there were other motives at work. 56% of our respondents said that they wanted to increase their “job security,” which is at odds with the low number who were concerned about losing their job. And 73% reported that they engaged in training or obtained certifications to increase their “hireability,” which may suggest more concern about job stability than our respondents would admit. The pandemic was a threat to many businesses, and employees were justifiably concerned that their job could vanish after a bad pandemic-influenced quarter. A desire for increased hireability may also indicate that we’ll see more people looking to change jobs in the near future.

Finally, 61% of the respondents said that they participated in training or earned certifications because they wanted a salary increase or a promotion (“increase in job title/responsibilities”). It isn’t surprising that employees see training as a route to promotion—especially as companies that want to hire in fields like data science, machine learning, and AI contend with a shortage of qualified employees. Given the difficulty of hiring expertise from outside, we expect an increasing number of companies to grow their own ML and AI talent internally using training programs.

Salaries by Gender

To nobody’s surprise, our survey showed that data science and AI professionals are mostly male. The number of respondents tells the story by itself: only 14% identified as women, which is lower than we’d have guessed, though it’s roughly consistent with our conference attendance (back when we had live conferences) and roughly equivalent to other technical fields. A small number (5%) reported their preferred pronoun as “they” or Other, but this sample was too small to draw any significant comparisons about compensation.

Women’s salaries were sharply lower than men’s salaries, averaging $126,000 annually, or 84% of the average salary for men ($150,000). That differential held regardless of education, as Figure 1 shows: the average salary for a woman with a doctorate or master’s degree was 82% of the salary for a man with an equivalent degree. The difference wasn’t quite as high for people with bachelor’s degrees or who were still students, but it was still significant: women with bachelor’s degrees or who were students earned 86% or 87% of the average salary for men. The difference in salaries was greatest between people who were self-taught: in that case, women’s salaries were 72% of men’s. An associate’s degree was the only degree for which women’s salaries were higher than men’s.

Figure 1. Women’s and men’s salaries by degree

Despite the salary differential, a higher percentage of women had advanced degrees than men: 16% of women had a doctorate, as opposed to 13% of men. And 47% of women had a master’s degree, as opposed to 46% of men. (If those percentages seem high, keep in mind that many professionals in data science and AI are escapees from academia.)

Women’s salaries also lagged men’s salaries when we compared women and men with similar job titles (see Figure 2). At the executive level, the average salary for women was $163,000 versus $205,000 for men (a 20% difference). At the director level, the difference was much smaller—$180,000 for women versus $184,000 for men—and women’s salaries were actually higher than those at the executive level. It’s easy to hypothesize about this difference, but we’re at a loss to explain it. For managers, women’s salaries were $143,000 versus $154,000 for men (a 7% difference).

Career advancement is also an issue: 18% of the women who participated in the survey were executives or directors, compared with 23% of the men.

Figure 2. Women’s and men’s salaries by job title

Before moving on from our consideration of the effect of gender on salary, let’s take a brief look at how salaries changed over the past three years. As Figure 3 shows, the percentage of men and women respondents who saw no change was virtually identical (18%). But more women than men saw their salaries decrease (10% versus 7%). Correspondingly, more men saw their salaries increase. Women were also more likely to have a smaller increase: 24% of women had an increase of under $5,000 versus 17% of men. At the high end of the salary spectrum, the difference between men and women was smaller, though still not zero: 19% of men saw their salaries increase by over $20,000, but only 18% of women did. So the most significant differences were in the midrange. One anomaly sticks out: a slightly higher percentage of women than men received salary increases in the $15,000 to $20,000 range (8% versus 6%).

Figure 3. Change in salary for women and men over three years

Salaries by Programming Language

When we looked at the most popular programming languages for data and AI practitioners, we didn’t see any surprises: Python was dominant (61%), followed by SQL (54%), JavaScript (32%), HTML (29%), Bash (29%), Java (24%), and R (20%). C++, C#, and C were further back in the list (12%, 12%, and 11%, respectively).

Discussing the connection between programming languages and salary is tricky because respondents were allowed to check multiple languages, and most did. But when we looked at the languages associated with the highest salaries, we got a significantly different list. The most widely used and popular languages, like Python ($150,000), SQL ($144,000), Java ($155,000), and JavaScript ($146,000), were solidly in the middle of the salary range. The outliers were Rust, which had the highest average salary (over $180,000), Go ($179,000), and Scala ($178,000). Other less common languages associated with high salaries were Erlang, Julia, Swift, and F#. Web languages (HTML, PHP, and CSS) were at the bottom (all around $135,000). See Figure 4 for the full list.

Figure 4. Salary vs. programming language

How do we explain this? It’s difficult to say that data and AI developers who use Rust command a higher salary, since most respondents checked several languages. But we believe that this data shows something significant. The supply of talent for newer languages like Rust and Go is relatively small. While there may not be a huge demand for data scientists who use these languages (yet), there’s clearly some demand—and with experienced Go and Rust programmers in short supply, they command a higher salary. Perhaps it is even simpler: regardless of the language someone will use at work, employers interpret knowledge of Rust and Go as a sign of competence and willingness to learn, which increases candidates’ value. A similar argument can be made for Scala, which is the native language for the widely used Spark platform. Languages like Python and SQL are table stakes: an applicant who can’t use them could easily be penalized, but competence doesn’t confer any special distinction.

One surprise is that 10% of the respondents said that they didn’t use any programming languages. We’re not sure what that means. It’s possible they worked entirely in Excel, which should be considered a programming language but often isn’t. It’s also possible that they were managers or executives who no longer did any programming.

Salaries by Tool and Platform

We also asked respondents what tools they used for statistics and machine learning and what platforms they used for data analytics and data management. We observed some of the same patterns that we saw with programming languages. And the same caution applies: respondents were allowed to select multiple answers to our questions about the tools and platforms that they use. (However, multiple answers weren’t as frequent as for programming languages.) In addition, if you’re familiar with tools and platforms for machine learning and statistics, you know that the boundary between them is fuzzy. Is Spark a tool or a platform? We considered it a platform, though two Spark libraries are in the list of tools. What about Kafka? A platform, clearly, but a platform for building data pipelines that’s qualitatively different from a platform like Ray, Spark, or Hadoop.

Just as with programming languages, we found that the most widely used tools and platforms were associated with midrange salaries; older tools, even if they’re still widely used, were associated with lower salaries; and some of the tools and platforms with the fewest users corresponded to the highest salaries. (See Figure 5 for the full list.)

The most common responses to the question about tools for machine learning or statistics were “I don’t use any tools” (40%) or Excel (31%). Ignoring the question of how one does machine learning or statistics without tools, we’ll only note that those who didn’t use tools had an average salary of $143,000, and Excel users had an average salary of $138,000—both below average. Stata ($120,000) was also at the bottom of the list; it’s an older package with relatively few users and is clearly falling out of favor.

The popular machine learning packages PyTorch (19% of users, $166,000 average salary), TensorFlow (20%, $164,000), and scikit-learn (27%, $157,000) occupied the middle ground. Those salaries were above the average for all respondents, which was pulled down by the large numbers who didn’t use tools or only used Excel. The highest salaries were associated with H2O (3%, $183,000), KNIME (2%, $180,000), Spark NLP (5%, $179,000), and Spark MLlib (8%, $175,000). It’s hard to trust conclusions based on 2% or 3% of the respondents, but it appears that salaries are higher for people who work with tools that have a lot of “buzz” but aren’t yet widely used. Employers pay a premium for specialized expertise.

Figure 5. Average salary by tools for statistics or machine learning

We see almost exactly the same thing when we look at data frameworks (Figure 6). Again, the most common response was from people who didn’t use a framework; that group also received the lowest salaries (30% of respondents, $133,000 average salary).

In 2021, Hadoop often seems like legacy software, but 15% of the respondents were working on the Hadoop platform, with an average salary of $166,000. That was above the average salary for all users and at the low end of the midrange for salaries sorted by platform.

The highest salaries were associated with Clicktale (now ContentSquare), a cloud-based analytics system for researching customer experience: only 0.2% of respondents use it, but they have an average salary of $225,000. Other frameworks associated with high salaries were Tecton (the commercial version of Michelangelo, at $218,000), Ray ($191,000), and Amundsen ($189,000). These frameworks had relatively few users—the most widely used in this group was Amundsen with 0.8% of respondents (and again, we caution against reading too much into results based on so few respondents). All of these platforms are relatively new, frequently discussed in the tech press and social media, and appear to be growing healthily. Kafka, Spark, Google BigQuery, and Dask were in the middle, with a lot of users (15%, 19%, 8%, and 5%) and above-average salaries ($179,000, $172,000, $170,000, and $170,000). Again, the most popular platforms occupied the middle of the range; experience with less frequently used and growing platforms commanded a premium.

Figure 6. Average salary by data framework or platform

Salaries by Industry

The greatest number of respondents worked in the software industry (20% of the total), followed by consulting (11%) and healthcare, banking, and education (each at 8%). Relatively few respondents listed themselves as consultants (also 2%), though consultancy tends to be cyclic, depending on current thinking on outsourcing, tax law, and other factors. The average income for consultants was $150,000, which is only slightly higher than the average for all respondents ($146,000). That may indicate that we’re currently in some kind of an equilibrium between consultants and in-house talent.

While data analysis has become essential to every kind of business and AI is finding many applications outside of computing, salaries were highest in the computer industry itself, as Figure 7 makes clear. For our purposes, the “computer industry” was divided into four segments: computer hardware, cloud services and hosting, security, and software. Average salaries in these industries ranged from $171,000 (for computer hardware) to $164,000 (for software). Salaries for the advertising industry (including social media) were surprisingly low, only $150,000.

Figure 7. Average salary by industry

Education and nonprofit organizations (including trade associations) were at the bottom end of the scale, with compensation just above $100,000 ($106,000 and $103,000, respectively). Salaries for technical workers in government were slightly higher ($124,000).

Salaries by State

When looking at data and AI practitioners geographically, there weren’t any big surprises. The states with the most respondents were California, New York, Texas, and Massachusetts. California accounted for 19% of the total, with over double the number of respondents from New York (8%). To understand how these four states dominate, remember that they make up 42% of our respondents but only 27% of the United States’ population.

Salaries in California were the highest, averaging $176,000. The Eastern Seaboard did well, with an average salary of $157,000 in Massachusetts (second highest). New York, Delaware, New Jersey, Maryland, and Washington, DC, all reported average salaries in the neighborhood of $150,000 (as did North Dakota, with five respondents). The average salary reported for Texas was $148,000, which is slightly above the national average but nevertheless seems on the low side for a state with a significant technology industry.

Salaries in the Pacific Northwest were not as high as we expected. Washington just barely made it into the top 10 in terms of the number of respondents, and average salaries in Washington and Oregon were $138,000 and $133,000, respectively. (See Figure 8 for the full list.)

The highest-paying jobs, with salaries over $300,000, were concentrated in California (5% of the state’s respondents) and Massachusetts (4%). There were a few interesting outliers: North Dakota and Nevada both had very few respondents, but each had one respondent making over $300,000. In Nevada, we’re guessing that’s someone who works for the casino industry—after all, the origins of probability and statistics are tied to gambling. Most states had no respondents with compensation over $300,000.

Figure 8. Average salary by state

The lowest salaries were, for the most part, from states with the fewest respondents. We’re reluctant to say more than that. These states typically had under 10 respondents, which means that averaging salaries is extremely noisy. For example, Alaska only had two respondents and an average salary of $75,000; Mississippi and Louisiana each only had five respondents, and Rhode Island only had three. In any of these states, one or two additional respondents at the executive level would have a huge effect on the state’s average. Furthermore, the averages in those states are so low that all (or almost all) respondents must be students, interns, or in entry-level positions. So we don’t think we can make any statement stronger than “the high-paying jobs are where you’d expect them to be.”

Job Change by Salary

Despite the differences between states, we found that the desire to change jobs based on lack of compensation didn’t depend significantly on geography. There were outliers at both extremes, but they were all in states where the number of respondents was small and one or two people looking to change jobs would make a significant difference. It’s not terribly interesting to say that 24% of respondents from California intend to change jobs (only 2% above the national average); after all, you’d expect California to dominate. There may be a small signal from states like New York, with 232 respondents, of whom 27% intend to change jobs, or from a state like Virginia, with 137 respondents, of whom only 19% were thinking of changing. But again, these numbers aren’t much different from the total percentage of possible job changers.

If intent to change jobs due to compensation isn’t dependent on location, then what does it depend on? Salary. It’s not at all surprising that respondents with the lowest salaries (under $50,000/year) are highly motivated to change jobs (29%); this group is composed largely of students, interns, and others who are starting their careers. The group that showed the second highest desire to change jobs, however, had the highest salaries: over $400,000/year (27%). It’s an interesting pairing: those with the highest and lowest salaries were most intent on getting a salary increase.

26% of those with annual salaries between $50,000 and $100,000 indicated that they intend to change jobs because of compensation. For the remainder of the respondents (those with salaries between $100,000 and $400,000), the percentage who intend to change jobs was 22% or lower.

Salaries by Certification

Over a third of the respondents (37%) replied that they hadn’t obtained any certifications in the past year. The next biggest group replied “other” (14%), meaning that they had obtained certifications in the past year but not one of the certifications we listed. We allowed them to write in their own responses, and they shared 352 unique answers, ranging from vendor-specific certifications (e.g., DataRobot) to university degrees (e.g., University of Texas) to well-established certifications in any number of fields (e.g., Certified Information Systems Security Professional a.k.a. CISSP). While there were certainly cases where respondents used different words to describe the same thing, the number of unique write-in responses reflects the sheer variety of certifications available.

Cloud certifications were by far the most popular. The top certification was for AWS (3.9% obtained AWS Certified Solutions Architect-Associate), followed by Microsoft Azure (3.8% had AZ-900: Microsoft Azure Fundamentals), then two more AWS certifications and CompTIA’s Security+ certification (1% each). Keep in mind that 1% only represents 27 respondents, and all the other certifications had even fewer respondents.

As Figure 9 shows, the highest salaries were associated with AWS certifications, the Microsoft AZ-104 (Azure Administrator Associate) certification, and the CISSP security certification. The average salary for people listing these certifications was higher than the average salary for US respondents as a whole. And the average salary for respondents who wrote in a certification was slightly above the average for those who didn’t earn any certifications ($149,000 versus $143,000).

Figure 9. Average salary by certification earned

Certifications were also associated with salary increases (Figure 10). Again AWS and Microsoft Azure dominate, with Microsoft’s AZ-104 leading the way, followed by three AWS certifications. And on the whole, respondents with certifications appear to have received larger salary increases than those who didn’t earn any technical certifications.

Figure 10. Average salary change by certification

Google Cloud is an obvious omission from this story. While Google is the third-most-important cloud provider, only 26 respondents (roughly 1%) claimed any Google certification, all under the “Other” category.

Among our respondents, security certifications were relatively uncommon and didn’t appear to be associated with significantly higher salaries or salary increases. Cisco’s CCNP was associated with higher salary increases; respondents who earned the CompTIA Security+ or CISSP certifications received smaller increases. Does this reflect that management undervalues security training? If this hypothesis is correct, undervaluing security is clearly a significant mistake, given the ongoing importance of security and the possibility of new attacks against AI and other data-driven systems.

Cloud certifications clearly had the greatest effect on salary increases. With very few exceptions, any certification was better than no certification: respondents who wrote in a certification under “Other” averaged a $9,600 salary increase over the last few years, as opposed to $8,900 for respondents who didn’t obtain a certification and $9,300 for all respondents regardless of certification.

Training

Participating in training resulted in salary increases—but only for those who spent more than 100 hours in a training program. As Figure 11 shows, those respondents had an average salary increase of $11,000. This was also the largest group of respondents (19%). Respondents who reported undertaking only 1–19 hours of training (8%) saw smaller salary increases, with an average of $7,100. It’s interesting that those who participated in 1–19 hours of training saw smaller increases than those who didn’t participate in training at all. It doesn’t make sense to speculate about this difference, but the data does make one thing clear: if you engage in training, be serious about it.

Figure 11. Average salary change vs. hours of training

We also asked what types of training respondents engaged in: whether it was company provided (for which there were three alternatives), a certification program, a conference, or some other kind of training (detailed in Figure 12). Respondents who took advantage of company-provided opportunities had the highest average salaries ($156,000, $150,000, and $149,000). Those who obtained certifications were next ($148,000). The results are similar if we look at salary increases over the past three years: those who participated in various forms of company-offered training received increases between $10,000 and $11,000. Salary increases for respondents who obtained a certification were in the same range ($11,000).

Figure 12. Average salary change vs. type of training

The Last Word

Data and AI professionals—a rubric under which we include data scientists, data engineers, and specialists in AI and ML—are well-paid, reporting an average salary just under $150,000. However, there were sharp state-by-state differences: salaries were significantly higher in California, though the Northeast (with some exceptions) did well.

There were also significant differences between salaries for men and women. Men’s salaries were higher regardless of job title, regardless of training and regardless of academic degrees—even though women were more likely to have an advanced academic degree (PhD or master’s degree) than were men.

We don’t see evidence of a “great resignation.” Job turnover through the pandemic was roughly what we’d expect (perhaps slightly below normal). Respondents did appear to be concerned about job security, though they didn’t want to admit it explicitly. But with the exception of the least- and most-highly compensated respondents, the intent to change jobs because of salary was surprisingly consistent and nothing to be alarmed at.

Training was important, in part because it was associated with hireability and job security but more because respondents were genuinely interested in learning new skills and improving current ones. Cloud training, particularly in AWS and Microsoft Azure, was the most strongly associated with higher salary increases.

But perhaps we should leave the last word to our respondents. The final question in our survey asked what areas of technology would have the biggest effect on salary and promotions in the coming year. It wasn’t a surprise that most of the respondents said machine learning (63%)—these days, ML is the hottest topic in the data world. It was more of a surprise that “programming languages” was noted by just 34% of respondents. (Only “Other” received fewer responses—see Figure 13 for full details.) Our respondents clearly aren’t impressed by programming languages, even though the data suggests that employers are willing to pay a premium for Rust, Go, and Scala.

There’s another signal worth paying attention to if we look beyond the extremes. Data tools, cloud and containers, and automation were nearly tied (46%, 47%, and 44%). The cloud and containers category includes tools like Docker and Kubernetes, cloud providers like AWS and Microsoft Azure, and disciplines like MLOps. The tools category includes tools for building and maintaining data pipelines, like Kafka. “Automation” can mean a lot of things but in this context probably means automated training and deployment.

Figure 13. What technologies will have the biggest effect on compensation in the coming year?

We’ve argued for some time that operations—successfully deploying and managing applications in production—is the biggest issue facing ML practitioners in the coming years. If you want to stay on top of what’s happening in data, and if you want to maximize your job security, hireability, and salary, don’t just learn how to build AI models; learn how to deploy applications that live in the cloud.

In the classic movie The Graduate, one character famously says, “There’s a great future in plastics. Think about it.” In 2021, and without being anywhere near as repulsive, we’d say, “There’s a great future in the cloud. Think about it.”

Radar trends to watch: September 2021 [Radar]

Let’s start with a moment of silence for O’Reilly Author Toby Segaran, who passed away on August 11, 2021.  Toby was one of the people who got the Data Science movement started. His book, Programming Collective Intelligence, taught many how to start using their data. Throughout his career, he mentored many, and was particularly influential in mentoring young women interested in science and technology. Toby is greatly missed by everyone in the Data Science community.

AI and Data

  • Margaret Mitchell joins HuggingFace to create tools to help build fair algorithms.
  • Embedded Machine Learning for Hard Hat Detection is an interesting real-world application of AI on the edge. Wearing hard hats is essential to work site safety; this project developed a model for detecting whether workers were wearing hard hats that could easily be deployed without network connectivity. It also goes into rebalancing datasets (in this case, public datasets with too few hard hats), a technique that is applicable to other instances of bias.
  • Liquid Neural Networks are neural networks that can adapt in real time to incoming data.  They are particularly useful for time series data–which, as the author points out, is almost all data.
  • US Government agencies plan to increase their use of facial recognition, in many cases for law enforcement, despite well-known accuracy problems for minorities and women.  Local bans on face recognition cannot prohibit federal use.
  • Data and Politics is an ongoing research project that studies how political organizations are collecting and using data.
  • FaunaDB is a distributed document database designed for serverless architectures. It comes with REST API support, GraphQL, built-in attribute based access control, and a lot of other great features.
  • Facial expression recognition is being added to a future version of Android as part of their accessibility package. Developers can create applications where expressions (smiles, etc.) can be used as commands.
  • OpenAI’s Codex (the technology behind Copilot) takes the next step: translating English into runnable code, rather than making suggestions.  Codex is now in private beta.
  • Who is responsible for publicly available datasets, and how do you ensure that they’re used appropriately? Margaret Mitchell suggests organizations for data stewardship. These would curate, maintain, and enforce legal standards for the use of public data.
  • An AI system can predict race accurately based purely on medical images, with no other information about the subject. This creates huge concerns about how bias could enter AI-driven diagnostics; but it also raises the possibility that we might discover better treatments for minorities who are underserved (or badly served) by the medical industry.
  • DeepMind has made progress in building a generalizable AI: AI agents that can solve problems that they have never seen before, and transfer learning from one problem to another. They have developed XLand, an environment that creates games and problems, to enable this research.
  • GPT-J is one of a number of open source alternatives to GitHub Copilot. It is smaller and faster, and appears to be at least as good. (A sketch of trying it out yourself follows this list.)
  • “Master faces” are images generated by adversarial neural networks that are capable of passing facial recognition tests without corresponding to any specific face.
  • Researchers have created a 3D map of a small part of a mouse’s brain. This is the most detailed map of how neurons connect that has ever been made.  The map contains 75,000 neurons and 523 million synapses; the map and the data set have been released to the public.
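
Since GPT-J’s weights are published on the Hugging Face Hub, trying it as a code-completion model is mostly a matter of disk space and patience. The sketch below assumes the EleutherAI/gpt-j-6B checkpoint and a transformers release recent enough to include GPT-J support; the prompt and generation settings are arbitrary choices, and at full precision the model weighs in around 24 GB, so a large GPU is strongly advised.

    # Sketch only: this downloads a very large model; expect to need a big GPU.
    from transformers import pipeline

    generator = pipeline("text-generation", model="EleutherAI/gpt-j-6B")

    prompt = "# Python function that returns the n-th Fibonacci number\ndef fib(n):"
    completion = generator(prompt, max_length=64, do_sample=True, temperature=0.2)
    print(completion[0]["generated_text"])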

Robotics

  • Robotic chameleons (or chameleon robotics): Researchers have developed a robotic “skin” that can change color in real time to match its surroundings.
  • Elon Musk announces that Tesla will release a humanoid robot next year; it will be capable of performing tasks like going to the store. Is this real, or just a distraction from investigations into the safety of Tesla’s autonomous driving software?
  • According to the UN, lethal autonomous robots (robots capable of detecting and attacking a target without human intervention) have been deployed and used by the Libyan government.
  • A new generation of warehouse robots is capable of simple manipulation (picking up and boxing objects); robots capable of more fine-grained manipulation are coming.

Security

  • The end of passwords draws even closer. GitHub is now requiring 2-factor authentication, preferably using WebAuthn or Yubikey. Amazon will be giving free USB authentication keys to some customers (root account owners spending over $100/month).
  • There are many vulnerabilities in charging systems for electric vehicles. This is sad, but not surprising: the automotive industry hasn’t learned from the problems of IoT security.
  • Advances in cryptography may make it more efficient to do computation without decrypting encrypted data.
  • Amazon is offering store credit to people who give them their palm prints, for use in biometric checkout at their brick-and-mortar stores.
  • Amazon, Google, Microsoft, and others join the US Joint Cyber Defense Collaborative to fight the spread of ransomware.
  • Apple will be scanning iPhones for images of child abuse.  Child abuse aside, this decision raises questions about cryptographic backdoors for government agencies and Apple’s long-standing marketing of privacy. If they can monitor for one thing, they can monitor for others, and can presumably be legally forced to do so.
  • Automating incident response: self-healing auto-remediation could be the next step in automating all the things, building more reliable systems, and eliminating the 3AM pager.

Hardware

  • Hearables are very small computers, worn in the ear, for which the only interface is a microphone, a speaker, and a network. They may have applications in education, music, real time translation (like Babelfish), and of course, next-generation hearing aids.
  • Timekeeping is an old and well-recognized problem in distributed computing. Facebook’s Time cards are an open-source (code and hardware) solution for accurate time keeping. The cards are PCIe bus cards (PC standard) and incorporate a satellite receiver and an atomic clock.
  • A new cellular board for IoT from Ray Ozzie’s company Blues Wireless is a very interesting product. It is easy to program (JSON in and out), interfaces easily to Raspberry Pi and other systems, and its $49 price includes 10 years of cellular connectivity.

Social Media

  • Researchers are using Google Trends data to identify COVID symptoms as a proxy for hospital data, since hospital data isn’t publicly available. The key is distinguishing flu-like symptoms caused by the flu from flu-like symptoms caused by COVID.
  • A topic-based approach to targeted advertising may be Google’s new alternative to tracking cookies, replacing the idea of assigning users to cohorts with similar behavior.
  • Facebook shares a little information about what’s most widely viewed on their network. It only covers the top 20 URLs and, given Facebook’s attempts to shut down researchers studying their behavior, qualifies as transparency theater rather than substance.
  • As an experiment, Twitter is allowing certain users to mark misleading content.  They have not specified (and presumably won’t specify) how to become one of these users. The information they gain won’t be used directly for blocking misinformation, but to study how it propagates.
  • Banning as a service: It’s now possible to hire a company to get someone banned from Instagram and other social media. Not surprisingly, these organizations may be connected to organizations that specialize in restoring banned accounts.
  • Facebook may be researching ways to use some combination of AI and homomorphic encryption to place targeted ads on encrypted messages without decrypting them.
  • Inspired by the security community and bug bounties, Twitter offers a bounty to people who discover algorithmic bias.

Work

  • Facebook’s virtual reality workrooms could transform remote meetings by putting all the participants in a single VR conference room–assuming that all the participants are willing to wear goggles.
  • A survey shows that 70% of employees would prefer to work at home, even if it costs them in benefits, including vacation time and salaries.  Eliminating the commute adds up.

Cloud

  • Sky computing–the next step towards true utility computing–is essentially what we now call “multi cloud,” but with an inter-cloud layer that provides interoperability between cloud providers.
  • Thoughts on the future of the data stack as data starts to take advantage of cloud: how do organizations get beyond “lift and shift” and other early approaches to use clouds effectively?

Networks

  • Matrix is another protocol for decentralized messaging (similar in concept to Scuttlebutt) that appears to be getting some enterprise traction.
  • Federated learning may become part of 6G: it could be used to build decentralized, intelligent wireless communications systems that predict traffic patterns to help with traffic management.
  • How do you scale intelligence at the edge of the network? APIs, industrially hardened Linux systems, and Kubernetes adapted to small systems (e.g., K3S).

Miscellaneous

  • The EU is considering a law that would require cryptocurrency transactions to be traceable.  A proposed EU-wide anti-money-laundering authority would also have jurisdiction over cryptocurrencies.
  • Autocorrect errors in Excel are a problem in genomics: autocorrect modifies gene names, which are frequently “corrected” to dates.
  • Google may have created the first time crystals in a quantum computer. Time crystals are a theoretical construct that has a structure that constantly changes but repeats over time, without requiring additional energy.

Rebranding Data [Radar]

There’s a flavor of puzzle in which you try to determine the next number or shape in a sequence. We’re living that now, but for naming the data field.  “Predictive analytics.” “Big Data.” “Data science.” “Machine learning.” “AI.” What’s next?

It’s hard to say.  These terms all claim to be different, but they are very much the same.  They are supersets, subsets, and Venn diagrams with a lot of overlap.  Case in point: machine learning used to be considered part of data science; now it’s seen as a distinct (and superior) field.  What gives?

Since the promise of “analyzing data for fun and profit” has proven so successful, it’s odd that the field would feel the need to rebrand every couple of years.  You’d think that it would build on a single name, to drive home its transformative power.  Unless, maybe, it’s not all it claims to be?

Resetting the hype cycle

In a typical bubble—whether in the stock market, or the Dot-Com era—you see a large upswing and then a crash.  The upswing is businesses over-investing time, money, and effort in The New Thing. The crash happens when those same groups realize that The New Thing won’t ultimately help them, and they suddenly stop throwing money at it.

In finance terms, we’d say that the upswing represents a large and growing delta between the fundamental price (what The New Thing is actually worth) and the observed price (what people are spending on it, which is based on what they think it’s worth).  The ensuing crash represents a correction: a sharp, sudden reduction in that delta, as the observed price falls to something closer to the fundamental price.

Given that, we should have seen the initial Big Data hype bubble expand and then burst once businesses determined that this would only help a very small number of companies.  Big Data never crashed, though. Instead, we saw “data science” take off.  What’s weird is that companies were investing in roughly the same thing as before. It’s as though the rebranding was a way of laundering the data name, so that businesses and consumers could more easily forget that the previous version didn’t hold up to its claims.  This is the old “hair of the dog” hangover cure.

And it actually works.  Until it doesn’t.

Data success is not dead; it’s just unevenly distributed

This isn’t to say that data analysis has no value. The ability to explore massive amounts of data can be tremendously useful.  And lucrative.  Just not for everyone.

Too often, companies look to the FAANGs—Facebook, Amazon, Apple, Netflix, Google: the businesses that have clearly made a mint in data analysis—and figure they can copycat their way to the same success.  Reality’s harsh lesson is that it’s not so simple.  “Collect and analyze data” is just one ingredient of a successful data operation. You also need to connect those activities to your business model, and hand-waving over that part is only a temporary solution. At some point, you need to actually determine whether the fancy new thing can improve your business.  If not, it’s time to let it go.

We saw the same thing in the 1990s Dot-Com bust. The companies that genuinely needed developers and other in-house tech staff continued to need them; those that didn’t, well, they were able to save money by shedding jobs that weren’t providing business value.

Maybe data’s constant re-branding is the lesson learned from the 1990s? That if we keep re-branding, we can ride the misplaced optimism, and we’ll never hit that low point?

Why it matters

If the data world is able to sustain itself by simply changing its name every few years, what’s the big deal? Companies are making money, consumers are happy with claims of AI-driven products, and some people have managed to find very lucrative jobs.  Why worry about this now?

This quote from Cem Karsan, founder of Aegea Capital Management, sums it up well.  He’s talking about flows of money on Wall St., but the analogy applies just as well to the AI hype bubble:

If you’re on an airplane, and you’re 30,000 feet off the ground, that 30,000 feet off the ground is the valuation gap.  That’s where valuations are really high. But if those engines are firing, are you worried up in that plane about the valuations?  No!  You’re worried about the speed and trajectory of where you’re going, based on the engines.  […]  But, when all of the sudden, those engines go off, how far off the ground you are is all that matters.


—Cem Karsan, from Corey Hoffstein’s Flirting with Models podcast, S4E1 (2021/05/03), starting 37:30

Right now most of AI’s 30,000-foot altitude is hype. At some point, changing the name will no longer keep the field aloft, and that hype will dissipate.  At that point you’ll have to sell based on what AI can really do, instead of a rosy, blurry picture of what might be possible.

This is when you might remind me of the old saying: “Make hay while the sun shines.”  I would agree, to a point.  So long as you’re able to cash out on the AI hype, even if that means renaming the field a few more times, go ahead.  But that’s a short-term plan.  Long-term survival in this game means knowing when that sun will set and planning accordingly.  How many more name-changes do we get?  How long before regulation and consumer privacy frustrations start to chip away at the façade?  How much longer will companies be able to paper over their AI-based systems’ mishaps?

Where to next?

If you’re building AI that’s all hype, then these questions may trouble you.  Post-bubble AI (or whatever we call it then) will be judged on meaningful characteristics and harsh realities: “Does this actually work?” and “Do the practitioners of this field create products and analyses that are genuinely useful?”  (For the investors in the crowd, this is akin to judging a company’s stock price on market fundamentals.)  Surviving long-term in this field will require that you find and build on realistic, worthwhile applications of AI.

Does our field need some time to sort that out?  I figure we have at least one more name change before we lose altitude.  We’ll need to use that time wisely, to become smarter about how we use and build around data.  We have to be ready to produce real value after the hype fades.

That’s easier said than done, but it’s far from impossible. We can start by shifting our focus to the basics, like reviewing our data and seeing whether it’s any good.  Accepting the uncomfortable truth that BI’s sums and groupings will help more businesses than AI’s neural networks. Evaluating the true total cost of AI, such that each six-figure data scientist salary is a proper business investment and not a very expensive lottery ticket.

We’ll also have to get better about folding AI into products (and understanding the risks in doing so), which will require building interdisciplinary, cognitively-diverse teams where everyone gets a chance to weigh in. Overall, then, we’ll have to educate ourselves and our customers on what data analysis can really achieve, and then plan our efforts accordingly.

We can do it. We’ll pretty much have to do it.  The question is: will we start before the plane loses altitude?

A Way Forward with Communal Computing [Radar]

Communal devices in our homes and offices aren’t quite right. In previous articles, we discussed the history of communal computing and the origin of the single user model. Then we reviewed the problems that arise due to identity, privacy, security, experience, and ownership issues. They aren’t solvable by just making a quick fix. They require a huge reorientation in how these devices are framed and designed.

This article focuses on modeling the communal device you want to build and understanding how it fits into the larger context. This includes how it interoperates with services that are connected, and how it communicates across boundaries with other devices in people’s homes. Ignore these warnings at your peril: the people who live with your device can always unplug it and recycle it.

Let’s first talk about how we gain an understanding of the environment inside homes and offices.

Mapping the communal space

We have seen a long list of problems that keep communal computing from aligning with people’s needs. This misalignment arises from the assumption that there is a single relationship between a person and a device, rather than between all the people involved and their devices.

Dr. S.A. Applin has referred to this assumption as “design individualism”; it is a common misframing used by technology organizations. She uses this term most recently in the paper “Facebook’s Project Aria indicates problems for responsible innovation when broadly deploying AR and other pervasive technology in the Commons:”

“Unfortunately, this is not an uncommon assumption in technology companies, but is a flaw in conceptual modelling that can cause great problems when products based on this ‘design individualism’ are deployed into the Commons (Applin, 2016b). In short, Facebook acknowledges the plural of ‘people’, but sees them as individuals collectively, not as a collective that is enmeshed, intertwined and exists based on multiple, multiplex, social, technological, and socio-technological relationships as described through [PolySocial Reality].”

PolySocial Reality (PoSR) is a theory described in a series of papers by Applin and Fisher (2010-ongoing) on the following:

“[PoSR] models the outcomes when all entities in networks send both synchronous and asynchronous messages to maintain social relationships. These messages can be human-to-human, human-to-machine, and machine-to-machine. PoSR contains the entirety of all messages at all times between all entities, and we can use this idea to understand how various factors in the outcomes from the way that messages are sent and received, can impact our ability to communicate, collaborate, and most importantly, cooperate with each other.”

In the case of PoSR, we need to consider how agents make decisions about the messages between entities. The designers of these non-human entities will make decisions that impact all entities in a system.

The reality is that the “self” only exists as part of a larger network. It is the connections between us and the rest of the network that are meaningful. We pull all of the pseudo-identities for those various connections together to create our “one” self.

The model that I’ve found most helpful to address this problem attempts to describe the complete environment of the communal space. It culminates in a map of the connections between nodes, or relationships between entities. This web of interactions includes all the individuals, the devices they use, and the services that intermediate them. The key is to understand how non-human entities intermediate the humans, and how those messages eventually make it to human actors.



The home is a network, like an ecosystem, of people, devices, and services all interacting to create an experience. It is also connected with services, people, and devices outside the home: for example, my mom, my mom’s picture frame, and the Google services that enable it.

To see why this map is helpful, consider an ecosystem (or food web). When we only consider interactions between individual animals, like a wolf eating a sheep, we ignore how the changes in population of each animal impacts other actors in the web: too many wolves mean the sheep population dies off. In turn, this change has an impact on other elements of the ecosystem like how much the grass grows. Likewise, when we only consider a single person interacting with one device, we find that most interactions are simple: some input from the user is followed by a response from the device. We often don’t consider other people interacting with the device, nor do we consider how other personal devices exist within that space. We start to see these interactions when we consider other people in the communal space, the new communal device, and all other personal devices. In a communal map, these all interact.

These ecosystems already exist within a home or office. They are made up of items ranging from refrigerator magnets for displaying physical pictures to a connected TV, and they include personal smartphones. The ecosystem extends to the services that the devices connect to outside the home, and to the other people whom they intermediate. We get an incomplete picture if we don’t consider the entire graph. Adding a new device isn’t about filling a specific gap in the ecosystem. The ecosystem may have many problems or challenges, but the ecosystem isn’t actively seeking to solve them. The new device needs to adapt and find its own niche. This includes making the ecosystem more beneficial to the device, something that evolutionary biologists call ‘niche expansion’. Technologists would think about this as building a need for their services.

Thinking about how a device creates a space within an already complex ecosystem is key to understanding what kinds of experiences the team building the device should create. It will help us do things like building for everyone and evolving with the space. It will also help us to avoid the things we should not do, like assuming that every device has to do everything.

Do’s and don’ts of building communal devices

With so much to consider when building communal devices, where do you start? Here are a few do’s and don’ts:

Do user research in the users’ own environment

Studying and understanding expectations and social norms is the key discovery task for building communal devices. Expectations and norms dictate the rules of the environment into which your device needs to fit, including people’s pseudo-identities, their expectations around privacy, and how willing they are to deal with the friction of added security. Just doing a survey isn’t enough.  Find people who are willing to let you see how they use these devices in their homes, and ask lots of questions about how they feel about the devices.

“If you are going to deal with social, people, communal, community, and general sociability, I would suggest hiring applied anthropologists and/or other social scientists on product teams. These experts will save you time and money, by providing you with more context and understanding of what you are making and its impact on others. This translates into more accurate and useful results.”

– Dr. S.A. Applin

Observing where the devices are placed and how the location’s use changes over time will give you fascinating insights about the context in which the device is used. A living room may be a children’s play area in the morning, a home office in the middle of the day, and a guest bedroom at night. People in these contexts have different sets of norms and privacy expectations.

As part of the user research, you should be building an ecosystem graph of all people present and the devices that they use. What people not present are intermediated by technology? Are there stories where this intermediation went wrong? Are there frictions that are created between people that your device should address? Are there frictions that the device should get out of the way of?

Do build for everyone who might have access

Don’t focus on the identity of the person who buys and sets up the device. You need to consider the identity (or lack) of everyone who could have access. Consider whether they feel that information collected about them violates their desire to control the information (as in Contextual Integrity). This could mean you need to put up walls to prevent users from doing something sensitive without authorization. Using the Zero Trust framework’s “trust engine” concept, you should ask for the appropriate level of authentication before proceeding.

Most of today’s user experience design is focused on making frictionless or seamless experiences. This goal doesn’t make sense when considering a risk tradeoff. In some cases, adding friction increases the chance that a user won’t move forward with a risky action, which could be a good thing. If the potential risk of showing a private picture is high, you should make it harder to show that picture.
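
Below is a minimal Python sketch of what that kind of risk-based step-up might look like; the authentication tiers, thresholds, and the simple risk formula are all hypothetical illustrations, not a prescription for any particular device.

```python
from enum import IntEnum

class AuthLevel(IntEnum):
    NONE = 0          # anyone in the room can proceed
    VOICE_MATCH = 1   # soft check, such as a recognized voice profile
    PIN = 2           # explicit secret shared with household members
    OWNER_DEVICE = 3  # confirmation on the owner's personal phone

def required_auth(content_sensitivity: float, audience_confidence: float) -> AuthLevel:
    """Map the risk of showing an item to the friction demanded first.

    content_sensitivity: 0.0 (public) .. 1.0 (highly private)
    audience_confidence: 0.0 (no idea who is present) .. 1.0 (certain)
    """
    risk = content_sensitivity * (1.0 - audience_confidence)
    if risk < 0.1:
        return AuthLevel.NONE
    if risk < 0.3:
        return AuthLevel.VOICE_MATCH
    if risk < 0.6:
        return AuthLevel.PIN
    return AuthLevel.OWNER_DEVICE

# A private photo (sensitivity 0.9) with little idea who is watching (confidence 0.2)
print(required_auth(0.9, 0.2).name)  # OWNER_DEVICE
```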

Realize you may not always understand the right context. Having good and safe default states for those cases is important. It is your job to adjust or simplify the model so that people can understand and interpret why the device does something.

Do consider pseudo-identities for individuals and groups

Avoid singular identities and focus on group pseudo-identities. If users don’t consider these devices their own, why not have the setup experience mirror those expectations? Build device setup, usage, and management around everyone who should have a say in the device’s operation.

Pseudo-identities become very interesting when you start to learn what certain behaviors mean for subgroups. Is this music being played for an individual with particular tastes? Or does the choice reflect a compromise between multiple people in the room? Should it avoid explicit language since there are children present?

Group norms and relationships need to be made more understandable. It will take technology advances to make these norms more visible. These advances include using machine learning to help the device understand what kind of content it is showing, and who (or what) is depicted in that content. Text, image, and video analysis needs to take place to answer the question: what type of content is this and who is currently in that context? It also means using contextual prediction to consider who may be in the room, their relationship to the people in the content, and how they may feel about the content. When in doubt, restrict what you do.

Do evolve with the space

As time goes on, life events will change the environment in which the device operates. Try to detect those changes and adapt accordingly. New pseudo-identities could be present, or the identity representing the group may shift. It is like moving into a new home. You may set things up in one way only to find months later there is a better configuration. Be aware of these changes and adapt.

If behavior that would be considered anomalous becomes the norm, something may have changed about the use of that space. Changes in use are usually led by a change in life–for example, someone moving in or out could trigger a change in how a device is used. Unplugging the device and moving it to a different part of the room or to a different shelf symbolizes a new need for contextual understanding. If you detect a change in the environment but don’t know why the change was made, ask.

Do use behavioral data carefully, or don’t use it at all

All communal devices end up collecting data. For example, Spotify uses what you are listening to when building recommendation systems. When dealing with behavioral information, the group’s identity is important, not the individual’s. If you don’t know who is in front of the device, you should consider whether you can use that behavioral data at all. Rather than using an individual identity, you may want to default to the group pseudo-identity’s recommendations. What does the whole house usually like to listen to?



When the whole family is watching, how do we find common ground based on all of our preferences, rather than just the owner’s? Spotify’s Premium Family package gives each person a recommended playlist, called a Family Mix, based on everyone’s listening behavior, whereas Netflix requires users to choose between individual profiles.

Spotify has family and couple accounts that allow multiple people to have an account under one bill. Each person gets their own login and recommendations. Spotify gives all sub-accounts on the subscription access to a shared playlist (like a Family Mix) that makes recommendations based on the group’s preferences.

Spotify, and services like it, should go a step further to reduce the weight of a song in their recommendations algorithm when it is being played on a shared device in a communal place–a kitchen, for example. It’s impossible to know everyone who is in a communal space. There’s a strong chance that a song played in a kitchen may not be preferred by anyone that lives there. To give that particular song a lot of weight will start to change recommendations on the group members’ personal devices.
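
A minimal sketch of what such down-weighting could look like is below; the device classes, weights, and function are invented for illustration and are not how Spotify or any other service actually computes recommendations.

```python
# Hypothetical weights applied to listening events before they feed a recommender.
DEVICE_WEIGHTS = {
    "personal_phone": 1.0,      # almost certainly the account owner
    "personal_laptop": 0.9,
    "family_member_device": 0.7,
    "communal_speaker": 0.2,    # kitchen or living-room device: unknown audience
    "party_mode": 0.0,          # explicitly social playback: ignore entirely
}

def weighted_play_counts(events):
    """events: iterable of (track_id, device_class) pairs."""
    counts = {}
    for track_id, device_class in events:
        weight = DEVICE_WEIGHTS.get(device_class, 0.5)  # unknown devices get a middling weight
        counts[track_id] = counts.get(track_id, 0.0) + weight
    return counts

plays = [("song_a", "personal_phone"), ("song_b", "communal_speaker"),
         ("song_b", "communal_speaker"), ("song_c", "party_mode")]
print(weighted_play_counts(plays))  # {'song_a': 1.0, 'song_b': 0.4, 'song_c': 0.0}
```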

If you can’t use behavioral data appropriately, don’t bring it into a user’s profile on your services. You should probably not collect it at all until you can handle the many people who could be using the device. Edge processing can allow a device to build context that respects the many people and their pseudo-identities that are at play in a communal environment. Sometimes it is just safer to not track.

Don’t assume that automation will work in all contexts

Prediction technology helps communal devices by finding behavior patterns. These patterns allow the device to calculate what content should be displayed and the potential trust. If a student always listens to music after school while doing homework, the device can assume that contextual integrity holds if the student is the only person there. These assumptions get problematic when part of the context is no longer understood, like when the student has other classmates over. That’s when violations of norms or of privacy expectations are likely to occur. If other people are around, different content is being requested, or if it is a different time of day, the device may not know enough to predict the correct information to display.

Amazon’s Alexa has started wading into these waters with its Hunches feature. If you say “good night” to Alexa, it can decide to turn off the lights. What happens if someone is quietly reading in the living room when the lights go out?  We’ve all accidentally turned the lights out on a friend or partner, but such mistakes quickly become more serious when they’re made by an algorithm.

When the prediction algorithm’s confidence is low, it should disengage and try to learn the new behavior. Worst case, just ask the user what is appropriate and gauge the trust vs risk tradeoff accordingly. The more unexpected the context, the less likely it is that the system should presume anything. It should progressively restrict features until it is at its core: for home assistants, that may just mean displaying the current time.
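
One way to sketch that progressive restriction in code is shown below; the confidence thresholds and feature names are hypothetical, and a real assistant would derive them from observed behavior rather than hard-coded tiers.

```python
# Illustrative mapping from prediction confidence to the features a home
# assistant may act on without asking first.
FEATURE_TIERS = [
    (0.9, {"show_time", "play_music", "adjust_lights", "read_messages"}),
    (0.6, {"show_time", "play_music", "adjust_lights"}),
    (0.3, {"show_time", "play_music"}),
]
CORE_FEATURES = {"show_time"}  # the fallback: just display the clock

def allowed_features(confidence: float) -> set:
    for threshold, features in FEATURE_TIERS:
        if confidence >= threshold:
            return features
    return CORE_FEATURES

def handle_request(action: str, confidence: float) -> str:
    if action in allowed_features(confidence):
        return f"doing: {action}"
    return f"asking before: {action}"  # low confidence in the context: don't presume

print(handle_request("adjust_lights", 0.95))  # doing: adjust_lights
print(handle_request("adjust_lights", 0.40))  # asking before: adjust_lights
```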

Don’t include all service functionality on the device

All product teams consider what they should add next to make a device “fully functional” and reflect all of the service possibilities. For a communal device, you can’t just think about what you could put there; you also have to consider what you will never put there. An example could be allowing access to Gmail messages from a Google Home Hub. If it doesn’t make sense for most people to have access to some feature, it shouldn’t be there in the first place. It just creates clutter and makes the device harder to use. It is entirely appropriate to allow people to change personal preferences and deal with highly personal information on their own, private devices. There is a time and place for the appropriate content.

Amazon has considered whether Echo users should be allowed to complete a purchase or be limited to just adding items to a shopping list. They have had to add four-digit codes and voice profiles. The resulting interface is complex enough to warrant a top-level help article on why people can’t make the purchases.

If you have already built too much, think about how to sunset certain features so that the value and differentiator of your device is clearer. Full access to personal data doesn’t work in the communal experience. It is a chance for some unknown privacy violation to occur.

Don’t assume your devices will be the only ones

Never assume that your company’s devices will be the only ones in the space. Even for large companies like Amazon, there is no future in which the refrigerator, oven, and TV will all be Amazon devices (even if they are trying really hard). The communal space is built up over a long time, and devices like refrigerators have lifetimes that can span decades.

Think about how your device might work alongside other devices, including personal devices. To do this, you need to integrate with network services (e.g. Google Calendar) or local device services (e.g. Amazon Ring video feed). This is the case for services within a communal space as well. People have different preferences for the services they use to communicate and entertain themselves. For example, Snapchat’s adoption by 13-24 year olds (~90% in the US market) accounts for 70% of its usage. This means that people over 24 years old are using very different services to interact with their family and peers.

Apple’s iOS now requires apps to ask for permission before collecting information from other devices on a local network, and the operating system verifies that an app is allowed to access those devices. Local network access is not a foregone conclusion either: different routers and wifi access points are increasingly managed by network providers.

Communal device manufacturers must build for interoperability between devices whether they like it or not, taking into account industry standards for communicating state, messaging, and more. A device that isn’t networked with the other devices in the home is much more likely to be replaced when the single, non-networked use is no longer valid or current.

Don’t change the terms without an ‘out’ for owners

Bricking a device because someone doesn’t want to pay for a subscription or doesn’t like the new data use policy is bad. Not only will it create distrust in users but it violates the idea that they are purchasing something for their home.

When you need to change terms, allow owners to make a decision about whether they want new functionality or to stop getting updates. Not having an active subscription is no excuse for a device to fail, since devices should be able to work when a home’s WiFi is down or when AWS has a problem that stops a home’s light bulbs from working. Baseline functionality should always be available, even if leading edge features (for example, features using machine learning) require a subscription. “Smart” or not, there should be no such thing as a light bulb that can’t be turned on.

When a company can no longer support a device–either because they’re sunsetting it or, in the worst case, because they are going out of business–they should consider how to allow people to keep using their devices. In some cases, a motivated community can take on the support; this happened with the Jibo community when the device creator shut down.

Don’t require personal mobile apps to use the device

One bad limitation that I’ve seen is requiring an app to be installed on the purchaser’s phone, and requiring the purchaser to be logged in to use the device. Identity and security aren’t always necessary, and being too strict about identity tethers the device to a particular person’s phone.

The Philips Hue smart light bulbs are a way to turn any light fixture into a component in a smart lighting system. However, you need one of their branded apps to control the lightbulbs. If you integrate your lighting system with your Amazon or Google accounts, you still need to know what the bulb or “zone” of your house is called. As a host you end up having to take the action for someone else (say by yelling at your Echo for them) or put a piece of paper in the room with all of the instructions. We are back in the age of overly complicated instructions to turn on a TV and AV system.

In addition to making sure you can integrate with other touch and voice interfaces, you need to consider physical ways to allow anyone to interact. IoT power devices like the VeSync Smart Plug by Etekcity (I have a bunch around the house) have a physical button to allow manual switching, in addition to integrating with your smart home or using their branded apps. If you can’t operate the device manually if you are standing in front of it, is it really being built for everyone in the home?

How do you know you got this right?

Once you have implemented all of the recommendations, how do you know you are on the right track?

A simple way to figure out whether you are building a communal-friendly device is to look for people adding their profiles to the device. This means linking their accounts to other services like Spotify (if you allow that kind of linking). However, not everyone will want to or be able to add their accounts, especially people who are passing through (guests) or who cannot legally consent (children).

Using behavior to detect whether someone else is using the device can be difficult. While people don’t change their taste in music or other interests quickly, they slowly drift through the space of possible options. We seek things that are similar to what we like, but just different enough to be novel. In fact, most of our music tastes are set in our teenage years. Therefore, if a communal device is asked to play songs in a language or genre that never appears on the owner’s personal device, it’s more likely that someone new is listening than that the owner has suddenly learned a new language. Compare what users are doing on your device to their behavior on other platforms (for example, compare a Google Home Hub in the kitchen to a personal iPhone) to determine whether new users are accessing the platform.
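
One simple way to make that comparison concrete is to compare the distribution of genres (or languages) observed on the communal device against the owner’s personal history. The sketch below uses total variation distance with an arbitrary 0.5 threshold; both the metric and the threshold are illustrative assumptions.

```python
from collections import Counter

def genre_distribution(plays):
    """plays: list of genre labels observed on one device."""
    counts = Counter(plays)
    total = sum(counts.values())
    return {genre: n / total for genre, n in counts.items()}

def total_variation(p, q):
    """Distance between two distributions: 0.0 (identical) .. 1.0 (disjoint)."""
    genres = set(p) | set(q)
    return 0.5 * sum(abs(p.get(g, 0.0) - q.get(g, 0.0)) for g in genres)

personal = genre_distribution(["indie", "indie", "folk", "indie"])
communal = genre_distribution(["kids", "kids", "indie", "kids"])

# A large gap suggests people other than the owner are using the communal device.
if total_variation(personal, communal) > 0.5:
    print("communal device likely has additional users")
```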

Behavioral patterns can also be used to predict demographic information. For example, you may be able to predict that someone is a parent based on their usage patterns. If this confidence is high, and you only see their interests showing up in the behavioral data, that means that other people who are around the device are not using it.

Don’t forget that you can ask the users themselves about who is likely to use the device. This is information that you can collect during initial setup. This can help ensure you are not making incorrect assumptions about the placement and use of the device.

Finally, consider talking with customers about how they use the device, the issues that come up, and how it fits into their lives. Qualitative user research doesn’t end after the initial design phase. You need to be aware of how the device has changed the environment it fits into. Without social scientists you can’t know this.

Is everything a communal experience?

Up until this point we have been talking about devices that are part of the infrastructure of a home, like a smart screen or light switch. Once we realize that technology serves as an intermediary between people, everything is communal.

Inside of a home, roommates generally have to share expenses like utilities with each other. Companies like Klarna and Braid make finances communal. How you pay together is an important aspect to harmony within a home.

You are also part of communities in your neighborhoods. Amazon Sidewalk extends your devices into the neighborhood you live in. This mesh technology starts to map and extend further with each communal space. Where does your home’s communal space end? If you misplaced your keys a block away, a Tile could help you find them. It could also identify people in your neighborhood without considering your neighbors’ privacy expectations.

Communities aren’t just based on proximity. We can extend the household to connect with other households far away. Amazon’s Drop In feature has created its own calling network between households. Loop, a new startup, is focused on building a device for connecting families in their own social network.

Google/Alphabet’s Sidewalk Labs has taken on projects that aim to make the connected world part of the cityscape. An early project called LinkNYC (owned through a shell corporation) was digital signage that included free calling and USB hubs. This changed how homeless people used the built environment: when walking down the street you could see people’s smartphones dangling from a LinkNYC kiosk while they panhandled nearby. Later, Sidewalk Labs withdrew its proposal for Sidewalk Toronto, a district-wide project, rather than have it officially rejected by voters. Every object within the urban environment becomes something that not only collects data but could be interactive.


The town square and public park have been built to be welcoming to people and to set expectations about what people do there, unlike online social media. New Public is taking cues from this type of physical shared space to reimagine the online public square.

Taking cues from the real world, groups like New Public are asking what would happen if we built social media the same way we build public spaces. What if social media followed the norms that we have in social spaces like the public parks or squares?

A key aspect to communal computing is the natural limitations of physical and temporal use. Only so many people can fit inside a kitchen or a meeting room. Only so many people can use a device at once, even if it is a subway ticket machine that services millions of people per month. Only so many can fit onto a sidewalk. We need to consider the way that space and time play a part in these experiences.

Adapt or be unplugged

Rethinking how people use devices together inside our homes, offices, and other spaces is key to the future of ubiquitous computing. We have a long way to go in understanding how context changes the expectations and norms of the people in those spaces. Without updating how we design and build these devices, the device you build will just be one more addition to the landfill.

To understand how devices are used in these spaces, we need to expand our thinking beyond the single owner and design for communal use from the start. If we don’t, the devices will never fit properly into our shared and intimate spaces. The mismatch between expectations and what is delivered will grow greater and lead to more dire problems.

This is a call for change in how we consider devices integrated into our lives. We shouldn’t assume that because humans are adaptive, we can adapt to the technologies built. We should design the technologies to fit into our lives, making sure the devices understand the context in which they’re working.

The future of computing that is contextual is communal.


Thanks

Thanks to Adam Thomas, Mark McCoy, Hugo Bowne-Anderson, and Danny Nou for their thoughts and edits on the early draft of this. Also, Dr. S.A. Applin for all of the great work on PoSR. Finally, from O’Reilly, Mike Loukides for being a great editor and Susan Thompson for the art.

Defending against ransomware is all about the basics [Radar]

The concept behind ransomware is simple. An attacker plants malware on your system that encrypts all the files, making your system useless, then offers to sell you the key you need to decrypt the files. Payment is usually in bitcoin (BTC), and the decryption key is deleted if you don’t pay within a certain period. Payments have typically been relatively small—though that’s obviously no longer true, with Colonial Pipeline’s multimillion-dollar payout.

Recently, ransomware attacks have been coupled with extortion: the malware sends valuable data (for example, a database of credit card numbers) back to the attacker, who then threatens to publish the data online if you don’t comply with the request.  

A survey on O’Reilly’s website1 showed that 6% of the respondents worked for organizations that were victims of ransomware attacks. How do you avoid joining them? We’ll have more to say about that, but the tl;dr is simple: pay attention to security basics. Strong passwords, two-factor authentication, defense in depth, staying on top of software updates, good backups, and the ability to restore from backups go a long way. Not only do they protect you from becoming a ransomware victim, but those basics can also help protect you from data theft, cryptojacking, and most other forms of cybercrime. The sad truth is that few organizations practice good security hygiene—and those that don’t end up paying the price.

But what about ransomware? Why is it such an issue, and how is it evolving? Historically, ransomware has been a relatively easy way to make money: set up operations in a country that’s not likely to investigate cybercrime, attack targets that are more likely to pay a ransom, keep the ransom small so it’s easier to pay than to restore from backup, and accept payment via some medium that’s perceived as anonymous. Like most things on the internet, ransomware’s advantage is scale: The WannaCry attack infected around 230,000 systems. If even a small percentage paid the US$300 ransom, that’s a lot of money.
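
To put that scale in perspective: if, say, 10% of those 230,000 infected systems had paid the $300 ransom, the take would have been roughly $6.9 million.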

Early on, attacks focused on small and midsize businesses, which often have limited IT staff and no professional security specialists. But more recently, hospitals, governments, and other organizations with valuable data have been attacked. A modern hospital can’t operate without patient data, so restoring systems is literally a matter of life and death. Most recently, we’ve seen attacks against large enterprises, like Colonial Pipeline. And this move toward bigger targets, with more valuable data, has been accompanied by larger ransoms.

Attackers have also gotten more sophisticated and specialized. They’ve set up help desks and customer service agents (much like any other company) to help customers make their payments and decrypt their data. Some criminal organizations offer “ransomware as a service,” running attacks for customers. Others develop the software or create the attacks that find victims. Initiating an attack doesn’t require any technical knowledge; it can all be contracted out, and the customer gets a nice dashboard to show the attack’s progress.

While it’s easy to believe (and probably correct) that government actors have gotten into the game, it’s important to keep in mind that attribution of an attack is very difficult—not least because of the number of actors involved. An “as a service” operator really doesn’t care who its clients are, and its clients may be (willingly) unaware of exactly what they’re buying. Plausible deniability is also a service.

How an attack begins

Ransomware attacks frequently start with phishing. An email to a victim entices them to open an attachment or to visit a website that installs malware. So the first thing you can do to prevent ransomware attacks is to make sure everyone is aware of phishing, very skeptical of any attachments they receive, and appropriately cautious about the websites they visit. Unfortunately, teaching people how to avoid being victimized by a phish is a battle you’re not likely to win. Phishes are getting increasingly sophisticated and now do a good job of impersonating people the victim knows. Spear phishing requires extensive research, and ransomware criminals have typically tried to compromise systems in bulk. But recently, we’ve been seeing attacks against more valuable victims. Larger, more valuable targets, with correspondingly bigger payouts, will merit the investment in research.

It’s also possible for an attack to start when a victim visits a legitimate but compromised website. In some cases, an attack can start without any action by the victim. Some ransomware (for example, WannaCry) can spread directly from computer to computer. One recent attack started through a supply chain compromise: attackers planted the ransomware in an enterprise security product, which was then distributed unwittingly to the product’s customers. Almost any vulnerability can be exploited to plant a ransomware payload on a victim’s device. Keeping browsers up-to-date helps to defend against compromised websites.

Most ransomware attacks begin on Windows systems or on mobile phones. This isn’t to imply that macOS, Linux, and other operating systems are less vulnerable; it’s just that other attack vectors are more common. We can guess at some reasons for this. Mobile phones move between different domains, as the owner goes from a coffee shop to home to the office, and are exposed to different networks with different risk factors. Although they are often used in risky territory, they’re rarely subject to the same device management that’s applied to “company” systems—but they’re often accorded the same level of trust. Therefore, it’s relatively easy for a phone to be compromised outside the office and then bring the attacker onto the corporate network when its owner returns to work.

It’s possible that Windows systems are common attack vectors just because there are so many of them, particularly in business environments. Many also believe that Windows users install updates less often than macOS and Linux users. Microsoft does a good job of patching vulnerabilities before they can be exploited, but that doesn’t do any good if updates aren’t installed. For example, Microsoft discovered and patched the vulnerability that WannaCry exploited well before the attacks began, but many individuals, and many companies, never installed the updates.

Preparations and precautions

The best defense against ransomware is to be prepared, starting with basic security hygiene. Frankly, this is true of any attack: get the basics right and you’ll have much less to worry about. If you’ve defended yourself against ransomware, you’ve done a lot to defend yourself against data theft, cryptojacking, and many other forms of cybercrime.

Security hygiene is simple in concept but hard in practice. It starts with passwords: Users must have nontrivial passwords. And they should never give their password to someone else, whether or not “someone else” is on staff (or claims to be).

Two-factor authentication (2FA), which requires something in addition to a password (for example, biometric authentication or a text message sent to a cell phone) is a must. Don’t just recommend 2FA; require it. Too many organizations buy and install the software but never require their staff to use it. (76% of the respondents to our survey said that their company used 2FA; 14% said they weren’t sure.)

Users should be aware of phishing and be extremely skeptical of email attachments that they weren’t expecting and websites that they didn’t plan to visit. It’s always a good practice to type URLs in yourself, rather than clicking on links in email—even those in messages that appear to be from friends or associates.

Backups are absolutely essential. But what’s even more important is the ability to restore from a backup. The easiest solution to ransomware is to reformat the disks and restore from backup. Unfortunately, few companies have good backups or the ability to restore from a backup—one security expert guesses that it’s as low as 10%. Here are a few key points:

  • You actually have to do the backups. (Many companies don’t.) Don’t rely solely on cloud storage; backup on physical drives that are disconnected when a backup isn’t in progress. (70% of our survey respondents said that their company performed backups regularly.)
  • You have to test the backups to ensure that you can restore the system. If you have a backup but can’t restore, you’re only pretending that you have a backup. (Only 48% of the respondents said that their company regularly practiced restoring from backups; 36% said they didn’t know.)
  • The backup device needs to be offline, connected only when a backup is in progress. Otherwise, it’s possible for the ransomware attack to encrypt your backup.

Don’t overlook testing your backups. Your business continuity planning should include ransomware scenarios: how do you continue doing business while systems are being restored? Chaos engineering, an approach developed at Netflix, is a good idea. Make a practice of breaking your storage capability, then restoring it from backup. Do this monthly—if possible, schedule it with the product and project management teams. Testing the ability to restore your production systems isn’t just about proving that everything works; it’s about training staff to react calmly in a crisis and resolve the outage efficiently. When something goes bad, you don’t want to be on Stack Overflow asking how to do a restore. You want that knowledge imprinted in everyone’s brains.
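
As a concrete (if simplified) piece of such a drill, the sketch below records a checksum manifest at backup time and then verifies a practice restore against it. The paths, file layout, and manifest format are assumptions for illustration; most real backup tooling provides its own verification, and this only covers file contents, not permissions or databases.

```python
import hashlib
import json
from pathlib import Path

def tree_manifest(root: Path) -> dict:
    """Map each file's relative path to a SHA-256 digest of its contents."""
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*")) if p.is_file()
    }

def record_manifest(data_dir: str, manifest_path: str) -> None:
    """Run at backup time; store the manifest alongside (not inside) the backup."""
    Path(manifest_path).write_text(json.dumps(tree_manifest(Path(data_dir))))

def verify_restore(restored_dir: str, manifest_path: str) -> bool:
    """Run after a practice restore into a scratch directory."""
    expected = json.loads(Path(manifest_path).read_text())
    actual = tree_manifest(Path(restored_dir))
    missing = set(expected) - set(actual)
    mismatched = {p for p in expected if p in actual and expected[p] != actual[p]}
    if missing or mismatched:
        print(f"restore FAILED: {len(missing)} missing, {len(mismatched)} corrupted files")
        return False
    print(f"restore verified: {len(expected)} files intact")
    return True

# Example drill (hypothetical paths):
#   record_manifest("/srv/data", "/safe/2021-08-01.manifest.json")        # at backup time
#   verify_restore("/mnt/restore-test", "/safe/2021-08-01.manifest.json")  # after restoring
```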

Keep operating systems and browsers up-to-date. Too many have become victims because of a vulnerability that was patched in a software update that they didn’t install. (79% of our survey respondents said that their company had processes for updating critical software, including browsers.)

An important principle in any kind of security is “least privilege.” No person or system should be authorized to do anything it doesn’t need to do. For example, no one outside of HR should have access to the employee database. “Of course,” you say—but that includes the CEO. No one outside of sales should have access to the customer database. And so on.

Least privilege works for software too. Services need access to other services—but services must authenticate to each other and should only be able to make requests appropriate to their role. Any unexpected request should be rejected and treated as a signal that the software has been compromised. And least privilege works for hardware, whether virtual or physical: finance systems and servers shouldn’t be able to access HR systems, for example. Ideally, they should be on separate networks.

You should have a “defense in depth” security strategy that focuses not only on keeping “bad guys” out of your network but also on limiting where they can go once they’re inside. You want to stop an attack that originates on HR systems from finding its way to the finance systems or some other part of the company. Particularly when you’re dealing with ransomware, making it difficult for an attack to propagate from one system to another is all-important.

Attribute-based access control (ABAC) can be seen as an extension of least privilege. ABAC is based on defining policies about exactly who and what should be allowed to access every service: What are the criteria on which trust should be based? And how do these criteria change over time? If a device suddenly moves between networks, does that represent a risk? If a system suddenly makes a request that it has never made before, has it been compromised? At what point should access to services be denied? ABAC, done right, is difficult and requires a lot of human involvement: looking at logs, deciding what kinds of access are appropriate, and keeping policies up-to-date as the situation changes. Working from home is an example of a major change that security people will need to take into account. You might have “trusted” an employee’s laptop, but should you trust it when it’s on the same network as their children? Some of this can be automated, but the bottom line is that you can’t automate security.
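
To make the idea concrete, here is a minimal sketch of ABAC-style policy evaluation with a default-deny posture. The roles, resources, attributes, and policies are invented for illustration; a production policy engine adds policy management, logging, and auditing, and is far more involved.

```python
from dataclasses import dataclass

@dataclass
class Request:
    subject_role: str     # e.g. "payroll-service"
    resource: str         # e.g. "hr-database"
    action: str           # e.g. "read"
    source_network: str   # e.g. "hr-vlan"
    mfa_verified: bool

# Each policy is a predicate plus a decision; the first match wins.
POLICIES = [
    (lambda r: r.subject_role == "payroll-service"
               and r.resource == "hr-database"
               and r.action == "read"
               and r.source_network == "hr-vlan"
               and r.mfa_verified, "allow"),
    (lambda r: r.resource == "hr-database", "deny"),  # nobody else touches HR data
]

def decide(request: Request) -> str:
    for predicate, decision in POLICIES:
        if predicate(request):
            return decision
    return "deny"  # least privilege: anything not explicitly allowed is refused

# A request from the wrong network is refused, and that refusal is worth logging.
req = Request("payroll-service", "hr-database", "read", "finance-vlan", True)
print(decide(req))  # deny
```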

Finally: detecting a ransomware attack isn’t difficult. If you think about it, this makes a lot of sense: encrypting all your files requires a lot of CPU and filesystem activity, and that’s a red flag. The way files change is also a giveaway. Most unencrypted files have low entropy: they have a high degree of order. (On the simplest level, you can glance at a text file and tell that it’s text. That’s because it has a certain kind of order. Other kinds of files are also ordered, though the order isn’t as apparent to a human.) Encrypted files have high entropy (i.e., they’re very disordered)—they have to be; otherwise, they’d be easy to decrypt. Computing a file’s entropy is simple and for these purposes doesn’t require looking at the entire file. Many security products for desktop and laptop systems are capable of detecting and stopping a ransomware attack. We don’t do product recommendations, but we do recommend that you research the products that are available. (PC Magazine’s 2021 review of ransomware detection products is a good place to start.)
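
For a sense of how simple the entropy check is, here is a short Python sketch. The 7.5 bits-per-byte threshold and the sample size are illustrative; note that compressed media also scores high, so real detectors look for sudden entropy changes across many files rather than judging a single measurement.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: plain text is typically 4-5,
    encrypted or compressed data is close to 8."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())

def looks_encrypted(path: str, threshold: float = 7.5, sample_size: int = 65536) -> bool:
    """Sample the start of a file; a fleet of files jumping above the threshold
    when they used to be low-entropy is a ransomware red flag."""
    with open(path, "rb") as f:
        sample = f.read(sample_size)
    return shannon_entropy(sample) > threshold

print(shannon_entropy(b"the quick brown fox jumps over the lazy dog"))  # roughly 4.4
```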

In the data center or the cloud

Detecting ransomware once it has escaped into a data center, whether in the cloud or on-premises, isn’t a fundamentally different task, but commercial products aren’t there yet. Again, prevention is the best defense, and the best defense is strong on the fundamentals. Ransomware makes its way from a desktop to a data center via compromised credentials and operating systems that are unpatched and unprotected. We can’t say this too often: make sure secrets are protected, make sure identity and access management are configured correctly, make sure you have a backup strategy (and that the backups work), and make sure operating systems are patched—zero-trust is your friend.

Amazon Web Services, Microsoft Azure, and Google Cloud all have services named “Identity and Access Management” (IAM); the fact that they all converged on the same name tells you something about how important it is. These are the services that configure users, roles, and privileges, and they’re the key to protecting your cloud assets. IAM doesn’t have a reputation for being easy. Nevertheless, it’s something you have to get right; misconfigured IAM is at the root of many cloud vulnerabilities. One report claims that well over 50% of the organizations using Google Cloud were running workloads with administrator privileges. While that report singles out Google, we believe that the same is true at other cloud providers. All of these workloads are at risk; administrator privileges should only be used for essential management tasks. Google Cloud, AWS, Azure, and the other providers give you the tools you need to secure your workloads, but they can’t force you to use them correctly.
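
As one small example of auditing for overly broad privileges, the sketch below uses boto3 to flag IAM users that have the AdministratorAccess managed policy attached directly. It deliberately ignores group memberships, roles, and inline policies, so treat it as a starting point for an audit, not a complete one.

```python
import boto3

ADMIN_POLICY_ARN = "arn:aws:iam::aws:policy/AdministratorAccess"

def users_with_direct_admin_access():
    """Return IAM user names with AdministratorAccess attached directly.
    (Access granted via groups, roles, or inline policies is not checked here.)"""
    iam = boto3.client("iam")
    flagged = []
    for page in iam.get_paginator("list_users").paginate():
        for user in page["Users"]:
            attached = iam.list_attached_user_policies(UserName=user["UserName"])
            if any(p["PolicyArn"] == ADMIN_POLICY_ARN for p in attached["AttachedPolicies"]):
                flagged.append(user["UserName"])
    return flagged

if __name__ == "__main__":
    for name in users_with_direct_admin_access():
        print(f"review: {name} has blanket administrator access")
```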

It’s worth asking your cloud vendor some hard questions. Specifically, what kind of support can your vendor give you if you are a victim of a security breach? What can your vendor do if you lose control of your applications because IAM has been misconfigured? What can your vendor do to restore your data if you succumb to ransomware? Don’t assume that everything in the cloud is “backed up” just because it’s in the cloud. AWS and Azure offer backup services; Google Cloud offers backup services for SQL databases but doesn’t appear to offer anything comprehensive. Whatever your solution, don’t just assume it works. Make sure that your backups can’t be accessed via the normal paths for accessing your services—that’s the cloud version of “leave your physical backup drives disconnected when not in use.” You don’t want an attacker to find your cloud backups and encrypt them too. And finally, test your backups and practice restoring your data.

Any frameworks your IT group has in place for observability will be a big help: Abnormal file activity is always suspicious. Databases that suddenly change in unexpected ways are suspicious. So are services (whether “micro” or “macroscopic”) that suddenly start to fail. If you have built observability into your systems, you’re at least partway there.

How confident are you that you can defend against a ransomware attack? In our survey, 60% of the respondents said that they were confident; another 28% said “maybe,” and 12% said “no.” We’d give our respondents good, but not great, marks on readiness (2FA, software updates, and backups). And we’d caution that confidence is good but overconfidence can be fatal. Make sure that your defenses are in place and that those defenses work.

If you become a victim

What do you do? Many organizations just pay. (Ransomwhe.re tracks total payments to ransomware sites, currently estimated at $92,120,383.83.) The FBI says that you shouldn’t pay, but if you don’t have the ability to restore your systems from backups, you might not have an alternative. Although the FBI was able to recover the ransom paid by Colonial Pipeline, I don’t think there’s any case in which they’ve been able to recover decryption keys.

Whether paying the ransom is a good option depends on how much you trust the cybercriminals responsible for the attack. The common wisdom is that ransomware attackers are trustworthy, that they’ll give you the key you need to decrypt your data and even help you use it correctly. If the word gets out that they can’t be trusted to restore your systems, they’ll find fewer victims willing to pay up. However, at least one security vendor says that 40% of ransomware victims who pay never get their files restored. That’s a very big “however,” and a very big risk—especially as ransomware demands skyrocket. Criminals are, after all, criminals. It’s all the more reason to have good backups.

There’s another reason not to pay that may be more important. Ransomware is a big business, and like any business, it will continue to exist as long as it’s profitable. Paying your attackers might be an easy solution short-term, but you’re just setting up the next victim. We need to protect each other, and the best way to do that is to make ransomware less profitable.

Another problem that victims face is extortion. If the attackers steal your data in addition to encrypting it, they can demand money not to publish your confidential data online—which may leave you with substantial penalties for exposing private data under laws such as GDPR and CCPA. This secondary attack is becoming increasingly common.

Whether or not they pay, ransomware victims frequently face revictimization because they never fix the vulnerability that allowed the ransomware in the first place. So they pay the ransom, and a few months later, they’re attacked again, using the same vulnerability. The attack may come from the same people, or it may come from someone else. Like any other business, attackers want to maximize their profits, and that might mean selling the information they used to compromise your systems to other ransomware outfits. If you become a victim, take that as a very serious warning. Don’t think that the story is over when you’ve restored your systems.

Here’s the bottom line, whether or not you pay. If you become a victim of ransomware, figure out how the ransomware got in and plug those holes. We began this article by talking about basic security practices. Keep your software up-to-date. Use two-factor authentication. Implement defense in depth wherever possible. Design zero-trust into your applications. And above all, get serious about backups and practice restoring from backup regularly. You don’t want to become a victim again.


Thanks to John Viega, Dean Bushmiller, Ronald Eddings, and Matthew Kirk for their help. Any errors or misunderstandings are, of course, mine.


Footnote

  1. The survey ran July 21, 2021, through July 23, 2021, and received more than 700 responses.

Radar trends to watch: August 2021 [Radar]

Security continues to be in the news: most notably the Kaseya ransomware attack, which was the first case of a supply chain ransomware attack that we’re aware of. That’s new and very dangerous territory. However, the biggest problem in security remains simple: take care of the basics. Good practices for authentication, backups, and software updates are the best defense against ransomware and many other attacks.

Facebook has said that it is now focusing on building the virtual reality Metaverse, which will be the successor to the web. To succeed, VR will have to get beyond ultra geeky goggles. But Google Glass showed the way, and that path is being followed by Apple and Facebook in their product development.

AI and Data

  • There’s a new technique for protecting natural language systems from attack by misinformation and malware bots: using honeypots to capture attackers’ key phrases proactively, and incorporate defenses into the training process.
  • DeepMind’s AlphaFold has made major breakthroughs in protein folding. DeepMind has released the source code for AlphaFold 2.0 on GitHub and will also release the structure of every known protein. The database currently includes over 350,000 protein structures, but is expected to grow to over 100,000,000. This is of immense importance to research in biology and medicine.
  • Google searches can now tell you why a given result was included. It’s a minor change, but we’ve long argued that in AI, “why” may give you more information than “what.”
  • Researchers have been able to synthesize speech using the brainwaves of a patient who has been paralyzed and unable to talk. The process combines brain wave detection with models that predict the next word.
  • The National Institute of Standards and Technology (NIST) tests systems for identifying airline passengers for flight boarding. They claim to have achieved 99.87% accuracy, without significant differences in performance between demographic groups.
  • An attempt has been made at adding imagination to AI by combining different attributes of known objects. Humans are good at this: we can imagine a green dog, for example.
  • Phase precession is a recently discovered phenomenon by which neurons encode information in the timing of their firing.  It may relate to humans’ ability to learn on the basis of a small number of examples.
  • Yoshua Bengio, Geoff Hinton, and Yann LeCun give an assessment of the state of Deep Learning, its future, and its ability to solve problems.
  • AI is learning to predict human behavior from videos (e.g., movies). This research attempts to answer the question “What will someone do next?” in situations where there are large uncertainties. One trick is reverting to high-level concepts (e.g., “greet”) when the system can’t predict more specific behaviors (e.g., “shake hands”).

Programming

  • JAX is a new Python library for high-performance mathematics. It includes a just-in-time compiler, support for GPUs and TPUs, automatic differentiation, and automatic vectorization and parallelization (see the short sketch after this list).
  • Matrix is an open standard for a decentralized “conversation store” that is used as the background for many other kinds of applications. Germany has announced that it will use Matrix as the standard for digital messaging in its national electronic health records system.
  • Brython is Python 3.9.5 running in the browser, with access to the DOM.  It’s not a replacement for JavaScript, but there are a lot of clever things you can do with it.
  • Using a terminal well has always been a superpower. Warp is a new terminal emulator built in Rust with features that you’d never expect: command sharing, long-term cloud-based history, a true text editor, and a lot more.
  • Is it WebAssembly’s time? Probably not yet, but it’s coming. Krustlets allow you to run WebAssembly workloads under Kubernetes. There is also an alternative to a filesystem written in wasm; JupyterLite is an attempt to build a complete distribution of Jupyter, including JupyterLab, that runs entirely in the browser.
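
To make the JAX item above concrete, here is a minimal sketch of the features mentioned: just-in-time compilation, automatic differentiation, and automatic vectorization. The toy loss function is invented for illustration.

    # Minimal JAX sketch: jit, grad, and vmap on a toy function.
    import jax
    import jax.numpy as jnp

    def loss(w, x):
        return jnp.sum((jnp.dot(x, w) - 1.0) ** 2)

    grad_loss = jax.jit(jax.grad(loss))               # compiled gradient of loss with respect to w
    batched_loss = jax.vmap(loss, in_axes=(None, 0))  # vectorize over a batch of x, no Python loop

    w = jnp.ones(3)
    xs = jnp.arange(12.0).reshape(4, 3)
    print(grad_loss(w, xs[0]))   # gradient for a single example
    print(batched_loss(w, xs))   # losses for the whole batch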

Robotics

  • Google launches Intrinsic, a moonshot project to develop industrial robots.
  • 21st Century Problems: should autonomous delivery robots be allowed in bike lanes? The Austin (Texas) City Council already has to consider this issue.

Materials

  • Veins in materials? Researchers have greatly reduced the time it takes to build vascular systems into materials, which could have an important impact on our ability to build self-healing structures.
  • Researchers have designed fabrics that can cool the body by up to 5 degrees Celsius by absorbing heat and re-emitting it in the near-infrared range.

Hardware

  • A bendable processor from ARM could be the future of wearable computing. It’s far from a state-of-the-art CPU, and probably will never be one, but with further development could be useful in edge applications that require flexibility.
  • Google experiments with error correction for quantum computers.  Developing error correction is a necessary step towards making quantum computers “real.”

Security

  • Attackers have learned to scan repositories on services like GitHub to find private keys and other credentials that have inadvertently been left in checked-in code. Checkov, a code analysis tool for detecting vulnerabilities in cloud infrastructure, can now find these credentials in code.
  • Amnesty International has released an open source tool for checking whether a phone has been compromised by Pegasus, the spyware sold by the NSO group to many governments, and used (among other things) to track journalists. Matthew Green’s perspective on “security nihilism” discusses the NSO’s activity; it is a must-read.
  • The REvil ransomware gang (among other things, responsible for the Kaseya attack, which infected over 1,000 businesses) has disappeared; all of its web sites went down at the same time. Nobody knows why; possibilities include pressure from law enforcement, reorganization, and even retirement.
  • DID is a new proposed form of decentralized digital identity that is currently being tested in the travel passports with COVID data being developed by the International Air Transport Association.
  • A massive ransomware attack by the REvil cybercrime group exploited supply chain vulnerabilities. The payload was implanted in a security product by Kaseya that is used to automate software installation and updates. The attack apparently only affects on-premises infrastructure. Victims are worldwide; the number of victims is in the “low thousands.”
  • Kubernetes is being used by the FancyBear cybercrime group, and other groups associated with the Russian government, to orchestrate a worldwide wave of brute-force attacks aimed at data theft and credential stealing.

Operations

  • Observability is the next step beyond monitoring.  That applies to data and machine learning, too, and is part of incorporating ML into production processes.
  • A new load balancing algorithm does a much better job of managing load at datacenters, and reduces power consumption by allowing servers to be shut down when not in use.
  • MicroK8S is a version of Kubernetes designed for small clusters that claims to be fault tolerant and self-healing, requiring little administration.
  • Calico is a Kubernetes plugin that simplifies network configuration. 

Web and Mobile

  • Scuttlebutt is a protocol for the decentralized web that’s “a way out of the social media rat race.”  It’s (by definition) “sometimes on,” not a constant presence.
  • Storywrangler is a tool for analyzing Twitter at scale.  It picks out the most popular word combinations in a large number of languages.
  • Google is adding support for “COVID vaccination passports” to Android devices.
  • Tim Berners-Lee’s Solid protocol appears to be getting real, with a small ecosystem of pod providers (online data stores) and apps.
  • Why are Apple and Google interested in autonomous vehicles? What’s the business model? They are after the last few minutes of attention. If you aren’t driving, you’ll be in an app.

Virtual Reality

  • Mark Zuckerberg has been talking up the Metaverse as the next stage in the Internet’s evolution: a replacement for the Web as an AR/VR world. But who will want to live in Facebook’s world?
  • Facebook is committing to the OpenXR standard for its Virtual Reality products. In August 2022, all new applications will be required to use OpenXR; its proprietary APIs will be deprecated.

Miscellaneous

  • The Open Voice Network is an industry association organized by the Linux Foundation that is dedicated to ethics in voice-driven applications. Their goal is to close the “trust gap” in voice applications.

Communal Computing’s Many Problems [Radar]

In the first article of this series, we discussed communal computing devices and the problems they create–or, more precisely, the problems that arise because we don’t really understand what “communal” means. Communal devices are intended to be used by groups of people in homes and offices. Examples include popular home assistants and smart displays like the Amazon Echo, Google Home, Apple HomePod, and many others.  If we don’t create these devices with communities of people in mind, we will continue to build the wrong ones.

Ever since the concept of a “user” was invented (which was probably later than you think), we’ve assumed that devices are “owned” by a single user. Someone buys the device and sets up the account; it’s their device, their account.  When we’re building shared devices with a user model, that model quickly runs into limitations. What happens when you want your home assistant to play music for a dinner party, but your preferences have been skewed by your children’s listening habits? We, as users, have certain expectations for what a device should do. But we, as technologists, have typically ignored our own expectations when designing and building those devices.

This expectation isn’t a new one either. The telephone in the kitchen was for everyone’s use. After the release of the iPad in 2010, Craig Hockenberry discussed the great value of communal computing, but also the concerns:

“When you pass it around, you’re giving everyone who touches it the opportunity to mess with your private life, whether intentionally or not. That makes me uneasy.”

Communal computing requires a new mindset that takes into account users’ expectations. If the devices aren’t designed with those expectations in mind, they’re destined for the landfill. Users will eventually experience “weirdness” and “annoyance” that grows to distrust of the device itself. As technologists, we often call these weirdnesses “edge cases.” That’s precisely where we’re wrong: they’re not edge cases, but they’re at the core of how people want to use these devices.

In the first article, we listed five core questions we should ask about communal devices:

  1. Identity: Do we know all of the people who are using the device?
  2. Privacy: Are we exposing (or hiding) the right content for all of the people with access?
  3. Security: Are we allowing all of the people using the device to do or see what they should and are we protecting the content from people that shouldn’t?
  4. Experience: What is the contextually appropriate display or next action?
  5. Ownership: Who owns all of the data and services attached to the device that multiple people are using?

In this article, we’ll take a deeper look at these questions, to see how the problems manifest and how to understand them.

Identity

All of the problems we’ve listed start with the idea that there is one registered and known person who should use the device. That model doesn’t fit reality: the identity of a communal device isn’t a single person, but everyone who can interact with it. This could be anyone able to tap the screen, make a voice command, use a remote, or simply be sensed by it. To understand this communal model and the problems it poses, start with the person who buys and sets up the device. It is associated with that individual’s account, like a personal Amazon account with its order history and shopping list. Then it gets difficult. Who doesn’t, can’t, or shouldn’t have full access to an Amazon account? Do you want everyone who comes into your house to be able to add something to your shopping list?

If you think about the spectrum of people who could be in your house, they range from people whom you trust, to people who you don’t really trust but who should be there, to those who you  shouldn’t trust at all.


There is a spectrum of trust for people who have access to communal devices

In addition to individuals, we need to consider the groups that each person could be part of. These group memberships are called “pseudo-identities”; they are facets of a person’s full identity, usually defined by how a person associates themselves with a group of other people. My life at work, at home, with a high school friends group, and as a sports fan shows different parts of my identity. When I’m with other people who share the same pseudo-identity, we can share information. When there are people from one group in front of a device, I may avoid showing content that is associated with another group (or another personal pseudo-identity). This can sound abstract, but it isn’t: if you’re with friends in a sports bar, you probably want notifications about the teams you follow. You probably don’t want news about work, unless it’s an emergency.

There are important reasons why we show a particular facet of our identity in a particular context. When designing an experience, you need to consider the identity context and where the experience will take place. Most recently this has come up with work from home. Many people talk about ‘bringing your whole self to work,’ but don’t realize that “your whole self” isn’t always appropriate. Remote work changes when and where I should interact with work. For a smart screen in my kitchen, it is appropriate to have content that is related to my home and family. Is it appropriate to have all of my work notifications and meetings there? Could it be a problem for children to have the ability to join my work calls? What does my IT group require as far as security of work devices versus personal home devices?

With these devices we may need to switch to a different pseudo-identity to get something done. I may need to be reminded of a work meeting. When I get a notification from a close friend, I need to decide whether it is appropriate to respond based on the other people around me.

The pandemic has broken down the barriers between home and work. The natural context switch from being at work and worrying about work things and then going home to worry about home things is no longer the case. People need to make a conscious effort to “turn off work” and to change the context. Just because it is the middle of the workday doesn’t always mean I want to be bothered by work. I may want to change contexts to take a break. Such context shifts add nuance to the way the current pseudo-identity should be considered, and to the overarching context you need to detect.

Next, we need to consider identities as groups that I belong to. I’m part of my family, and my family would potentially want to talk with other families. I live in a house that is on my street alongside other neighbors. I’m part of an organization that I identify as my work. These are all pseudo-identities we should consider, based on where the device is placed and in relation to other equally important identities.

The crux of the problem with communal devices is the multiple identities that are or may be using the device. This requires greater understanding of who, where, and why people are using the device. We need to consider the types of groups that are part of the home and office.

Privacy

As we consider the identities of all people with access to the device, and the identity of the place the device is to be part of, we start to consider what privacy expectations people may have given the context in which the device is used.

Privacy is hard to understand. The framework I’ve found most helpful is Contextual Integrity, which was introduced by Helen Nissenbaum in her book Privacy in Context. Contextual Integrity describes four key aspects of privacy:

  1. Privacy is provided by appropriate flows of information.
  2. Appropriate information flows are those that conform to contextual information norms.
  3. Contextual informational norms refer to five independent parameters: data subject, sender, recipient, information type, and transmission principle.
  4. Conceptions of privacy are based on ethical concerns that evolve over time.

What is most important about Contextual Integrity is that privacy is not about hiding information away from the public but giving people a way to control the flow of their own information. The context in which information is shared determines what is appropriate.

This flow either feels appropriate, or not, based on key characteristics of the information (from Wikipedia); a small illustrative sketch of these parameters as a data structure follows the list:

  1. The data subject: Who or what is this about?
  2. The sender of the data: Who is sending it?
  3. The recipient of the data: Who will eventually see or get the data?
  4. The information type: What type of information is this (e.g. a photo, text)?
  5. The transmission principle: In what set of norms is this being shared (e.g. school, medical, personal communication)?
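
As a thought experiment (not anything from Nissenbaum’s book), these five parameters could be represented as a simple data structure that a communal device consults before displaying something. The field names and the norm check below are hypothetical, invented only to make the parameters concrete.

    # Hypothetical sketch: an information flow and a naive appropriateness check.
    from dataclasses import dataclass

    @dataclass
    class InformationFlow:
        data_subject: str             # who or what the information is about
        sender: str                   # who is sending it
        recipient: str                # who will eventually see or get it
        information_type: str         # e.g. "photo", "text"
        transmission_principle: str   # the set of norms under which it is shared

    def is_appropriate(flow: InformationFlow, accepted_recipients: set) -> bool:
        # A real system would weigh all five parameters against contextual norms;
        # here we only check the recipient, purely for illustration.
        return flow.recipient in accepted_recipients

    photo = InformationFlow("me", "my friend", "company intranet", "photo", "personal communication")
    print(is_appropriate(photo, {"my friend", "my family"}))  # False: the recipient changed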

We rarely acknowledge how a subtle change in one of these parameters could be a violation of privacy. It may be completely acceptable for my friend to have a weird photo of me, but once it gets posted on a company intranet site it violates how I want information (a photo) to flow. The recipient of the data has changed to something I no longer find acceptable. But I might not care whether a complete stranger (like a burglar) sees the photo, as long as it never gets back to someone I know.

For communal use cases, the sender or receiver of information is often a group. There may be  multiple people in the room during a video call, not just the person you are calling. People can walk in and out. I might be happy with some people in my home seeing a particular photo, but find it embarrassing if it is shown to guests at a dinner party.

We must also consider what happens when other people’s content is shown to those who shouldn’t see it. This content could be photos or notifications from people outside the communal space that could be seen by anyone in front of the device. Smartphones can hide message contents when you aren’t near your phone for this exact reason.

The services themselves can expand the ‘receivers’ of information in ways that create uncomfortable situations. In Privacy in Context, Nissenbaum talks about the privacy implications of Google Street View when it places photos of people’s houses on Google Maps. When a house was only visible to people who walked down the street, that was one thing; but when anyone in the world can access a picture of a house, that changes the parameters in a way that causes concern. Most recently, IBM used Flickr photos that were shared under a Creative Commons license to train facial recognition algorithms. While this didn’t require any change to the terms of service, it was a surprise to people and may have violated the Creative Commons license. In the end, IBM took the dataset down.

Privacy considerations for communal devices should focus on who is gaining access to information and whether it is appropriate based on people’s expectations. Without a framework like Contextual Integrity, we will be stuck talking about generalized rules for data sharing, and there will always be edge cases that violate someone’s privacy.


A note about children

Children make identity and privacy especially tricky. About 40% of all households have a child. Children shouldn’t be an afterthought. If you aren’t compliant with local laws, you can get in a lot of trouble. In 2019, YouTube had to settle with the FTC for a $170 million fine for selling ads targeting children. It gets complicated because the ‘age of consent’ depends on the region as well: COPPA in the US covers people under 13, CCPA in California covers people under 16, and GDPR defaults to under 16 but lets each member state set its own age. The moment you acknowledge children are using your platforms, you need to accommodate them.

For communal devices, there are many use cases for children. Once they realize they can play whatever music they want (including tracks of fart sounds) on a shared device they will do it. Children focus on the exploration over the task and will end up discovering way more about the device than parents might. Adjusting your practices after building a device is a recipe for failure. You will find that the paradigms you choose for other parties won’t align with the expectations for children, and modifying your software to accommodate children is difficult or impossible. It’s important to account for children from the beginning.


Security

To get to a home assistant, you usually need to pass through a home’s outer door. There is usually a physical limitation by way of a lock. There may be alarm systems. Finally, there are social norms: you don’t just walk into someone else’s house without knocking or being invited.

Once you are past all of these locks, alarms, and norms, anyone can access the communal device. Few things within a home are restricted–possibly a safe with important documents. When a communal device requires authentication, it is usually subverted in some way for convenience: for example, a password might be taped to it, or a password may never have been set.

The concept of Zero Trust Networks speaks to this problem. It comes down to a key question: is the risk associated with an action greater than the trust we have that the person performing the action is who they say they are?


Source: https://learning.oreilly.com/library/view/zero-trust-networks/9781491962183/

Passwords, passcodes, or mobile device authentication become nuisances; these supposed secrets are frequently shared between everyone who has access to the device. Passwords might be written down for people who can’t remember them, making them visible to less trusted people visiting your household. Have we not learned anything since the movie War Games?

When we consider the risk associated with an action, we need to understand its privacy implications. Would the action expose someone’s information without their knowledge? Would it allow a person to pretend to be someone else? Could another party easily tell that the device was being used by an imposter?

There is a tradeoff between trust and risk. The device needs to calculate whether we know who the person is and whether the person wants the information to be shown. That needs to be weighed against the potential risk or harm if an inappropriate person is in front of the device.


Having someone in your home accidentally share embarrassing photos could have social implications.

A few examples of this tradeoff:

  • Feature: showing a photo when the device detects someone in the room. Risk and trust calculation: photo content sensitivity; who is in the room. Possible issue: showing an inappropriate photo to a complete stranger.
  • Feature: starting a video call. Risk and trust calculation: the account being used for the call; the actual person starting the call. Possible issue: when the other side picks up, it may not be who they thought it would be.
  • Feature: playing a personal song playlist. Risk and trust calculation: personal recommendations being impacted. Possible issue: incorrect future recommendations.
  • Feature: automatically ordering something based on a voice command. Risk and trust calculation: convenience of ordering; approval of the shopping account’s owner. Possible issue: shipping an item that shouldn’t have been ordered.
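
A toy sketch of the underlying calculation: compare a confidence score for who is present against the sensitivity of the requested action. The function, names, and scores below are invented for illustration only; a real device would weigh far more signals.

    # Hypothetical trust-versus-risk check for a communal device action.
    def should_proceed(identity_confidence: float, action_risk: float) -> bool:
        """Proceed only when confidence in who is present outweighs the action's risk."""
        return identity_confidence >= action_risk

    # Illustrative, made-up scores between 0 and 1.
    print(should_proceed(identity_confidence=0.9, action_risk=0.2))   # show a weather card: True
    print(should_proceed(identity_confidence=0.6, action_risk=0.9))   # place an order: False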

This gets even trickier when people no longer in the home can access the devices remotely. There have been cases of harassment, intimidation, and domestic abuse by people whose access should have been revoked: for example, an ex-partner turning off the heating system. When should someone be able to access communal devices remotely? When should their access be controllable from the devices themselves? How should people be reminded to update their access control lists? How does basic security maintenance happen inside a communal space?

See how much work this takes in a recent account of pro bono security work for a harassed mother and her son. Or how a YouTuber was blackmailed, surveilled, and harassed by her smart home. Apple even has a manual for this type of situation.

At home, where there’s no corporate IT group to create policies and automation to keep things secure, it’s next to impossible to manage all of these security issues. Even some corporations have trouble with it. We need to figure out how users will maintain and configure a communal device over time. Configuring devices in the home and office is fraught, because needs change in many different ways over time.

For example, what happens when someone leaves the home and is no longer part of it? We will need to remove their access and may even find it necessary to block them from certain services. This is highlighted by cases of harassment carried out through spouses or ex-partners who still control the communal devices. Ongoing maintenance of a particular device could also be triggered by a change in the community’s needs. A home device may be used to just play music or check the weather at first. But when a new baby comes home, being able to do video calling with close relatives may become a higher priority.

End users are usually very bad at changing configuration after it is set. They may not even know that they can configure something in the first place. This is why people have made a business out of setting up home stereo and video systems. People just don’t understand the technologies they are putting in their houses. Does that mean we need some type of handy-person that does home device setup and management? When more complicated routines are required to meet the needs, how does someone allow for changes without writing code, if they are allowed to?

Communal devices need new paradigms of security that go beyond the standard login. The world inside a home is protected by a barrier like a locked door; the capabilities of communal devices should respect that. This means both removing friction in some cases and increasing it in others.


A note about biometrics

 “Turn your face” to enroll in Google Face Match and personalize your devices.
(Source: Google Face Match video, https://youtu.be/ODy_xJHW6CI?t=26)

Biometric authentication for voice and face recognition can help us get a better understanding of who is using a device. Examples of biometric authentication include FaceID for the iPhone and voice profiles for Amazon Alexa. There is a push for regulation of facial recognition technologies, but opt-in for authentication purposes tends to be carved out.

However, biometrics aren’t without problems. In addition to issues with skin tone, gender bias, and local accents, biometrics assumes that everyone is willing to have a biometric profile on the device–and that they would be legally allowed to (for example, children may not be allowed to consent to a biometric profile). It also assumes this technology is secure. Google FaceMatch makes it very clear it is only a technology for personalization, rather than authentication. I can only guess they have legalese to avoid liability when an unauthorized person spoofs someone’s face, say by taking a photo off the wall and showing it to the device.

What do we mean by “personalization?” When you walk into a room and FaceMatch identifies your face, the Google Home Hub dings, shows your face icon, then shows your calendar (if it is connected), and a feed of personalized cards. Apple’s FaceID uses many levels of presentation attack detection (also known as “anti-spoofing”): it verifies your eyes are open and you are looking at the screen, and it uses a depth sensor to make sure it isn’t “seeing” a photo. The phone can then show hidden notification content or open the phone to the home screen. This measurement of trust and risk is benefited by understanding who could be in front of the device. We can’t forget that the machine learning that is doing biometrics is not a deterministic calculation; there is always some degree of uncertainty.

Social and information norms define what we consider acceptable, who we trust, and how much. As trust goes up, we can take more risks in the way we handle information. However, it’s difficult to connect trust with risk without understanding people’s expectations. I have access to my partner’s iPhone and know the passcode. It would be a violation of a norm if I walked over and unlocked it without being asked, and doing so will lead to reduced trust between us.

As we can see, biometrics does offer some benefits but won’t be the panacea for the unique uses of communal devices. Biometrics will allow those willing to opt-in to the collection of their biometric profile to gain personalized access with low friction, but it will never be useable for everyone with physical access.


Experiences

People use a communal device for short experiences (checking the weather), ambient experiences (listening to music or glancing at a photo), and joint experiences (multiple people watching a movie). The device needs to be aware of norms within the space and between the multiple people in the space. Social norms are rules by which people decide how to act in a particular context or space. In the home, there are norms about what people should and should not do. If you are a guest, you try to see if people take their shoes off at the door; you don’t rearrange things on a bookshelf; and so on.

Most software is built to work for as many people as possible; this is called generalization. Norms stand in the way of generalization. Today’s technology isn’t good enough to adapt to every possible situation. One strategy is to simplify the software’s functionality and let the humans enforce norms. For example, when multiple people talk to an Echo at the same time, Alexa will either not understand or it will take action on the last command. Multi-turn conversations between multiple people are still in their infancy. This is fine when there are understood norms–for example, between my partner and me. But it doesn’t work so well when you and a child are both trying to shout commands.


Shared experiences can be challenging like a parent and child yelling at an Amazon Echo to play what they want.

Norms are interesting because they tend to be learned and negotiated over time, but are invisible. Experiences that are built for communal use need to be aware of these invisible norms through cues that can be detected from peoples’ actions and words. This gets especially tricky because a conversation between two people could include information subject to different expectations (in a Contextual Integrity sense) about how that information is used. With enough data, models can be created to “read between the lines” in both helpful and dangerous ways.

Video games already cater to multiple people’s experiences. With the Nintendo Switch or any other gaming system, several people can play together in a joint experience. However, the rules governing these experiences are never applied to, say, Netflix. The assumption is always that one person holds the remote. How might these experiences be improved if software could accept input from multiple sources (remote controls, voice, etc.) to build a selection of movies that is appropriate for everyone watching?

Communal experience problems highlight inequalities in households. With women doing more household coordination than ever, there is a need to rebalance the tasks for households. Most of the time these coordination tasks are relegated to personal devices, generally the wife’s mobile phone, when they involve the entire family (though there is a digital divide outside the US). Without moving these experiences into a place that everyone can participate in, we will continue these inequalities.

So far, technology has been great at intermediating people for coordination through systems like text messaging, social networks, and collaborative documents. We don’t build interaction paradigms that allow multiple people to engage at the same time in their communal spaces. To do this, we need to address the fact that the norms dictating appropriate behavior are invisible and pervasive in the spaces where these technologies are deployed.

Ownership

Many of these devices are not really owned by the people who buy them. As part of the current trend towards subscription-based business models, the device won’t function if you don’t subscribe to a service. Those services have license agreements that specify what you can and cannot do (which you can read if you have a few hours to spare and can understand them).

For example, this has been an issue for fans of Amazon’s Blink camera. The home automation industry is fragmented: there are many vendors, each with its own application to control their particular devices. But most people don’t want to use different apps to control their lighting, their television, their security cameras, and their locks. Therefore, people have started to build controllers that span the different ecosystems. Doing so has caused Blink users to get their accounts suspended.

What’s even worse is that these license agreements can change whenever the company wants. Licenses are frequently modified with nothing more than a notification, after which something that was previously acceptable is now forbidden. In 2020, Wink suddenly applied a monthly service charge; if you didn’t pay, the device would stop working. Also in 2020, Sonos caused a stir by saying they were going to “recycle” (disable) old devices. They eventually changed their policy.

The issue isn’t just what you can do with your devices; it’s also what happens to the data they create. Amazon’s Ring partnership with one in ten US police departments troubles many privacy groups because it creates a vast surveillance program. What if you don’t want to be a part of the police state? Make sure you check the right box and read your terms of service. If you’re designing a device, you need to require users to opt in to data sharing (especially as regions adopt GDPR- and CCPA-like regulation).

While techniques like federated learning are on the horizon as a way to avoid latency issues and mass data collection, it remains to be seen whether those techniques will satisfy companies that collect data. Is there a benefit, to both organizations and their customers, in limiting or obfuscating the transmission of data away from the device?

Ownership is particularly tricky for communal devices. The expectations of consumers who put something in their home collide directly with the way rent-to-use services are pitched. Until we acknowledge that hardware put in a home is different from a cloud service, we will never get it right.

Lots of problems, now what?

Now that we have dived into the various problems that rear their heads with communal devices, what do we do about them? In the next article, we discuss a way to map the communal space. This helps build a better understanding of how a communal device fits in the context of the space and the services that already exist.

We will also provide a list of dos and don’ts for leaders, developers, and designers to consider when building a communal device.

Thinking About Glue [Radar]

In Glue: the Dark Matter of Software, Marcel Weiher asks why there’s so much code. Why is Microsoft Office 400 million lines of code? Why are we always running into the truth of Alan Kay’s statement that “Software seems ‘large’ and ‘complicated’ for what it does”?

Weiher makes an interesting claim: the reason we have so much code is Glue Code, the code that connects everything together. It’s “invisible and massive”; it’s “deemed not important”; and, perhaps most important, it’s “quadratic”: the glue code is proportional to the square of the number of things you need to glue. That feels right; and in the past few years, we’ve become increasingly aware of the skyrocketing number of dependencies in any software project significantly more complex than “Hello, World!” We can all add our own examples: the classic article Hidden Technical Debt in Machine Learning Systems shows a block diagram of a system in which machine learning is a tiny block in the middle, surrounded by all sorts of infrastructure: data pipelines, resource management, configuration, etc. Object-relational mapping (ORM) frameworks are a kind of glue between application software and databases. Web frameworks facilitate gluing together components of various types, along with gluing that front end to some kind of back end. The list goes on.

Weiher makes another important point: the simplest abstraction for glue is the Unix pipe (|), although he points out that pipes are not the only solution. Anyone who has used Unix or a variant (and certainly anyone who has read–or in my case, written–chunks of Unix Power Tools) realizes how powerful the pipe is. A standard way to connect tools that are designed to do one thing well: that’s important.

But there’s another side to this problem, and one that we often sweep under the rug. A pipe has two ends: something that’s sending data, and something that’s receiving it. The sender needs to send data in a format that the receiver understands, or (more likely) the receiver needs to be able to parse and interpret the sender’s data in a way that it understands. You can pipe all the log data you want into an awk script (or perl, or python), but that script is still going to have to parse that data to make it interpretable. That’s really what those millions of lines of glue code do: either format data so the receiver can understand it or parse incoming data into a usable form. (This task falls more often on the receiver than the sender, largely because the sender often doesn’t—and shouldn’t—know anything about the receiver.)
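
For example, the receiving end of a pipe might look like this minimal Python sketch, which parses an assumed “timestamp level message” log format from stdin; the format is invented for illustration, which is exactly the point: some piece of glue has to know it.

    # Minimal glue: parse "timestamp level message" lines arriving on stdin,
    # e.g.  tail -f app.log | python errors.py
    import sys

    for line in sys.stdin:
        parts = line.rstrip("\n").split(maxsplit=2)
        if len(parts) < 3:
            continue                      # skip lines that don't match the assumed format
        timestamp, level, message = parts
        if level == "ERROR":
            print(f"{timestamp}: {message}")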

From this standpoint, the real problem with glue isn’t moving data, though the Unix pipe is a great abstraction; it’s data integration. In a discussion about blockchains and medical records, Jim Stogdill once said “the real problem has nothing to do with blockchains. The real problem is data integration.” You can put all the data you want on a blockchain, or in a data warehouse, or in a subsurface data ocean the size of one of Jupiter’s moons, and you won’t solve the problem that application A generates data in a form that application B can’t use. If you know anything about medical records (and I know very little), you know that’s the heart of the problem. One major vendor has products that aren’t even compatible with each other, let alone competitors’ systems. Not only are data formats incompatible, the meanings of fields in the data are often different in subtle ways. Chasing down those differences can easily run to hundreds of thousands, if not millions, of lines of code.
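
To make that concrete, here is a hypothetical sliver of this kind of glue: two invented “systems” that describe the same record with different field names, date formats, and units, each needing its own adapter before their data can be combined. Every detail here is made up for illustration.

    # Hypothetical adapters: the same "patient weight" record from two systems.
    def from_system_a(record):
        return {
            "patient_id": record["PatientID"],
            "weight_kg": record["WeightLb"] * 0.453592,   # system A reports pounds
            "birth_date": record["DOB"],                  # already ISO 8601
        }

    def from_system_b(record):
        year, month, day = record["dob"].split("/")       # system B uses YYYY/MM/DD
        return {
            "patient_id": record["id"],
            "weight_kg": record["weight_kg"],
            "birth_date": f"{year}-{month}-{day}",
        }

    print(from_system_a({"PatientID": "A-1001", "WeightLb": 154.0, "DOB": "1984-03-09"}))
    print(from_system_b({"id": "B-77", "weight_kg": 70.0, "dob": "1984/03/09"}))

    # Each new pair of systems that must talk directly needs another adapter like
    # these (plus agreement on what each field means), and the code piles up.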

Pipes are great for moving data from one place to another. But there’s no equivalent standard for data integration. XML might play a role, but it only solves the easy part of the problem: standardizing parsing has some value, but the ease of parsing XML was always oversold, and the real problems stem more from schemas than data formats. (And please don’t play the “XML is human-readable and -writable” game.) JSON strikes me as XML for “pickling” JavaScript objects, replacing angle brackets with curly braces: a good idea that has gotten a lot of cross-language support, but like XML neglects the tough part of the problem.

Is data integration a problem that can be solved? In networking, we have standards for what data means and how to send it. All those TCP/IP packet headers that have been in use for almost 40 years (the first deployment of IPv4 was in 1982) have kept data flowing between systems built by different vendors. The fields in the header have been defined precisely, and new protocols have been built successfully at every layer of the network stack.
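
That precision is what lets the same twenty bytes be interpreted identically everywhere. As a small sketch (not part of any particular library), here is the fixed part of an IPv4 header packed and unpacked in Python; the addresses come from the documentation ranges.

    # Minimal sketch: pack and unpack the 20-byte fixed IPv4 header.
    import socket
    import struct

    IPV4_HEADER = "!BBHHHBBH4s4s"   # version/IHL, TOS, length, ID, flags/frag, TTL, proto, checksum, src, dst

    def parse_ipv4_header(raw: bytes) -> dict:
        (ver_ihl, tos, total_len, ident, flags_frag,
         ttl, proto, checksum, src, dst) = struct.unpack(IPV4_HEADER, raw[:20])
        return {
            "version": ver_ihl >> 4,
            "header_len_bytes": (ver_ihl & 0x0F) * 4,
            "ttl": ttl,
            "protocol": proto,              # 6 = TCP, 17 = UDP
            "src": socket.inet_ntoa(src),
            "dst": socket.inet_ntoa(dst),
        }

    sample = struct.pack(IPV4_HEADER, 0x45, 0, 40, 0, 0, 64, 6, 0,
                         socket.inet_aton("192.0.2.1"), socket.inet_aton("198.51.100.7"))
    print(parse_ipv4_header(sample))

Because every implementation agrees on these offsets and meanings, the receiver needs no sender-specific glue; that agreement is exactly what is missing at the application layer.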

But this kind of standardization doesn’t solve the N squared problem. In a network stack, TCP talks to TCP; HTTPS talks to HTTPS. (Arguably, it keeps the N squared problem from being an N cubed problem.) The network stack designs the N squared problem out of existence, at least as far as the network itself is concerned, but that doesn’t help at the application layer. When we’re talking applications, a medical app needs to understand medical records, financial records, regulatory constraints, insurance records, reporting systems, and probably dozens more. Nor does standardization really solve the problem of new services. IPv4 desperately needs to be replaced (and IPv6 has been around since 1995), but IPv6 has been “5 years in the future” for two decades now. Hack on top of hack has kept IPv4 workable; but will layer upon layer of hacks work if we’re extending medical or financial applications?

Glue code expands as the square of the number of things that are glued. The need to glue different systems together is at the core of the problems facing software development; as systems become more all-encompassing, the need to integrate with different systems increases. The glue–which includes code written for data integration–becomes its own kind of technical debt, adding to the maintenance burden. It’s rarely (if ever) refactored or just plain removed because you always need to “maintain compatibility” with some old system.  (Remember IE6?)

Is there a solution? In the future, we’ll probably need to integrate more services. The glue code will be more complex, since it will probably need to live in some “zero trust” framework (another issue, but an important one). Still, knowing that you’re writing glue code, keeping track of where it is, and being proactive about removing it when it’s no longer needed will keep the problem manageable. Designing interfaces carefully and observing standards will minimize the need for glue. In the final analysis, is glue code really a problem? Programming is ultimately about gluing things together, whether they’re microservices or programming libraries. Glue isn’t some kind of computational waste; it’s what holds our systems together. Glue development is software development.

Radar trends to watch: July 2021 [Radar]

Certainly the biggest news of the past month has been a continuation of the trend towards regulating the biggest players in the tech industry.  The US House of Representatives is considering 5 antitrust bills that would lead to major changes in the way the largest technology companies do business; and the Biden administration has appointed a new Chair of the Federal Trade Commission who will be inclined to use these regulations aggressively. Whether these bills pass in their current form, how they are challenged in court, and what changes they will lead to is an open question.  (Late note: Antitrust cases against Facebook by the FTC and state governments based on current law were just thrown out of court.)

Aside from that, we see AI spreading into almost every area of computing; this list could easily have a single AI heading that subsumes programming, medicine, security, and everything else.

AI and Data

  • A new algorithm allows autonomous vehicles to locate themselves using computer vision (i.e., without GPS) regardless of the season; it works even when the terrain is snow-covered.
  • An AI-based wildfire detection system has been deployed in Sonoma County. It looks for smoke plumes, and can monitor many more cameras than a human.
  • Researchers are investigating how racism and other forms of abuse enter AI models like GPT-3, and what can be done to prevent their appearance in the output. It’s essential for AI to “understand” racist content, but equally essential for it not to generate that content.
  • Google has successfully used Reinforcement Learning to design the layout for the next generation TPU chip. The layout process took 6 hours, and replaced weeks of human effort. This is an important breakthrough in the design of custom integrated circuits.
  • Facebook has developed technology to identify the source from which deepfake images originate. “Fingerprints” (distortions in the image) make it possible to identify the model that generated the images, and possibly to track down the creators.
  • Adaptive mood control is a technique that autonomous vehicles can use to detect passengers’ emotions and drive accordingly, making it easier for humans to trust the machine. We hope this doesn’t lead AVs to drive faster when the passenger is angry or frustrated.
  • IBM has developed Uncertainty Quantification 360, a set of open source tools for quantifying the uncertainty in AI systems. Understanding uncertainty is a big step towards building trustworthy AI and getting beyond the idea that the computer is always right. Trust requires understanding uncertainty.
  • Waymo’s autonomous trucks will begin carrying real cargo between Houston and Fort Worth, in a partnership with a major trucking company.
  • GPT-2 can predict brain activity and comprehension in fMRI studies of patients listening to stories, possibly indicating that in some way its processes correlate to brain function.
  • GPT-J is a language model with performance similar to GPT-3.  The code and weights are open source.
  • It appears possible to predict preferences directly by comparing brain activity to activity of others (essentially, brain-based collaborative filtering). A tool for advertising or for self-knowledge?
  • Feature stores are tools that automate building pipelines to deliver data to ML applications in production. Tecton, which originated with Uber’s Michelangelo, is one of the early commercial products available.
  • How does machine learning work with language? Everything You Ever Said doesn’t answer the question, but lets you play with an NLP engine by pasting in a text, then adding or subtracting concepts to see how the text is transformed.  (Based on GLoVE, a pre-GPT model.)
  • The HateCheck dataset tests the ability of AI applications to detect hate speech correctly. Hate speech is a hard problem; being too strict causes systems to reject content that shouldn’t be classified as hate speech, while being too lax allows hate speech through.

Ethics

  • Twitter has built a data ethics group aimed at putting ethics into practice, in addition to research.  Among others, the group includes Rumman Chowdhury and Kristian Lum.
  • A study of the effect of noise on fairness in lending shows that insufficient (hence noisier) data is as big a problem as biased data. Poor people have less credit history, which means that their credit scores are often inaccurate. Correcting problems arising from noise is much more difficult than dealing with problems of bias.
  • Andrew Ng’s newsletter, The Batch, reports on a survey of executives finding that most companies are not practicing “responsible AI” and don’t even understand the issues. There is no consensus about the importance (or even the meaning) of “ethics” for AI.
  • Using AI to screen resumes is a problem in itself, but AI doing the interview? That’s taking problematic to a new level. It can be argued that AI, when done properly, is less subject to bias than a human interviewer, but we suspect that AI interviewers present more problems than solutions.

Web

  • WebGPU is a proposal for a standard API that makes GPUs directly accessible to web pages for rendering and computation.
  • An end to providing cookie consent for every site you visit?  The proposed ADPC (advanced data protection control) standard will allow users to specify privacy preferences once.
  • Using social media community guidelines as a political weapon: the Atajurt Kazakh Human Rights channel, which publishes testimonies from people imprisoned in China’s internment camps, has been taken down repeatedly as a result of coordinated campaigns.

Security

  • Microsoft is working on eliminating passwords! Other companies should take the hint. Microsoft is stressing biometrics (which have their own problems) and multi-factor authentication.
  • Supply chain security is very problematic.  Microsoft admits to an error in which they mistakenly signed a device driver that was actually a rootkit, causing security software to ignore it. The malware somehow slipped through Microsoft’s signing process.
  • Markpainting is a technology for defeating attempts to create a fake image by adding elements to the picture that aren’t visible, but that will become visible when the image is modified (for example, to eliminate a watermark).
  • Amazon Sidewalk lets Amazon devices connect to other open WiFi nets to extend their range and tap others’ internet connections. Sidewalk is a cool take on decentralized networking. It is also a Very Bad Idea.
  • Authentication using gestures, hand shapes, and geometric deep learning? I’m not convinced, but this could be a viable alternative to passwords and crude biometrics. It would have to work for people of all skin colors, and that has consistently been a problem for vision-based products.
  • According to Google, Rowhammer attacks are gaining momentum–and will certainly gain even more momentum as feature sizes in memory chips get smaller. Rowhammer attacks repeatedly access a single row in a memory chip, hoping to corrupt adjacent bits.
  • While details are sketchy, the FBI was able to recover the BTC Colonial Pipeline paid to Darkside to restore systems after their ransomware attack. The FBI has been careful to say that they can’t promise recovering payments in other cases. Whether this recovery reflects poor opsec on the part of the criminals, or that Bitcoin is more easily de-anonymized than most people think, it’s clear that secrecy and privacy are relative.

Design and User Experience

  • Communal Computing is about designing devices that are inherently shared: home assistants, home automation, and more. The “single account/user” model doesn’t work.
  • A microphone that only “hears” frequencies above the human hearing range can be used to detect human activities (for example, in a smart home device) without recording speech.
  • Digital Twins in aerospace at scale: One problem with the adoption of digital twins is that the twin is very specific to a single device. This research shows that it’s possible to model real-world objects in ways that can be reused across collections of objects and different applications.

Medicine

  • The Open Insulin Foundation is dedicated to creating the tools necessary to produce insulin at scale. This is the next step in a long-term project by Anthony DiFranco and others to challenge the pharma companies’ monopoly on insulin production, and create products at a small fraction of the price.
  • Where’s the work on antivirals and other treatments for COVID-19? The answer is simple: Vaccines are very profitable. Antivirals aren’t. This is a huge, institutional problem in the pharmaceutical industry.
  • The National Covid Cohort Collaborative (N3C) is a nationwide database of anonymized medical records of COVID patients. What’s significant isn’t COVID, but that N3C is a single database, built to comply with privacy laws, that’s auditable, and that’s open for any group to make research proposals.
  • Can medical trials be sped up by re-using control data (data from patients who were in the control group) from previous trials? Particularly for rare and life-threatening diseases, getting trial volunteers is difficult because nobody wants to be assigned to the control group.
  • A remote monitoring patch for COVID patients uses AI to understand changes in the patient’s vital signs, allowing medical staff to intervene immediately if a patient’s condition worsens. Unlike most such devices, it was trained primarily on Black and Hispanic patients.
  • Machine learning in medicine is undergoing a credibility crisis: poor data sets with limited diversity lead to poor results.

Programming

  • Microsoft, OpenAI, and GitHub have announced a new service called Copilot that uses AI to make suggestions to programmers as they are writing code (currently in “technical preview”).  It is truly a cybernetic pair programmer.
  • Windows 11 will run Android apps. If nothing else, this is a surprise. Android apps will be provided via the Amazon store, not Google Play.
  • Microsoft’s PowerFx is a low-code programming language based on Excel formulas (which now include lambdas).  Input and output are through what looks like a web page. What does it mean to strip Excel from its 2D grid? Is this a step forward or backward for low code computing?
  • Open Source Insights is a Google project for investigating the dependency chain of any open source project. Its ability currently is limited to a few major packaging systems (including npm, Cargo, and maven), but it will be expanded.
  • Quantum computing’s first application will be in researching quantum mechanics: understanding the chemistry of batteries, drugs, and materials. In these applications, noise is an asset, not a problem.

Hand Labeling Considered Harmful [Radar]

We are traveling through the era of Software 2.0, in which the key components of modern software are increasingly determined by the parameters of machine learning models, rather than hard-coded in the language of for loops and if-else statements. There are serious challenges with such software and models, including the data they’re trained on, how they’re developed, how they’re deployed, and their impact on stakeholders. These challenges commonly result in both algorithmic bias and lack of model interpretability and explainability.

There’s another critical issue, which is in some ways upstream to the challenges of bias and explainability: while we seem to be living in the future with the creation of machine learning and deep learning models, we are still living in the Dark Ages with respect to the curation and labeling of our training data: the vast majority of labeling is still done by hand.

There are significant issues with hand labeling data:

  • It introduces bias, and hand labels are neither interpretable nor explainable.
  • There are prohibitive costs to hand labeling datasets (both financial costs and the time of subject matter experts).
  • There are no gold labels: even the most well-known hand labeled datasets have label error rates of at least 5% (ImageNet has a label error rate of 5.8%!).

We are living through an era in which we get to decide how human and machine intelligence interact to build intelligent software to tackle many of the world’s toughest challenges. Labeling data is a fundamental part of human-mediated machine intelligence, and hand labeling is not only the most naive approach but also one of the most expensive (in many senses) and most dangerous ways of bringing humans in the loop. Moreover, it’s just not necessary as many alternatives are seeing increasing adoption. These include:

  • Semi-supervised learning
  • Weak supervision
  • Transfer learning
  • Active learning
  • Synthetic data generation

These techniques are part of a broader movement known as Machine Teaching, a core tenet of which is having humans and machines each do what they do best. We need to use expertise efficiently: the financial cost and time required for experts to hand-label every data point can break projects in fields such as diagnostic imaging for life-threatening conditions or security- and defense-related satellite imagery analysis. Hand labeling in the age of these other technologies is akin to scribes hand-copying books post-Gutenberg.

There is also a burgeoning landscape of companies building products around these technologies, such as Watchful (weak supervision and active learning; disclaimer: one of the authors is CEO of Watchful), Snorkel (weak supervision), Prodigy (active learning), Parallel Domain (synthetic data), and AI Reverie (synthetic data).

Hand Labels and Algorithmic Bias

As Deb Raji, a Fellow at the Mozilla Foundation, has pointed out, algorithmic bias “can start anywhere in the system—pre-processing, post-processing, with task design, with modeling choices, etc.,” and the labeling of data is a crucial point at which bias can creep in.


Figure 1: Bias can start anywhere in the system. Image adapted from A Framework for Understanding Sources of Harm throughout the Machine Learning Life Cycle by Harini Suresh and John Guttag.

High-profile cases of bias in training data resulting in harmful models include an Amazon recruiting tool that “penalized resumes that included the word ‘women’s,’ as in ‘women’s chess club captain.’” Don’t take our word for it. Play the educational game Survival of the Best Fit where you’re a CEO who uses a machine learning model to scale their hiring decisions and see how the model replicates the bias inherent in the training data. This point is key: as humans, we possess all types of biases, some harmful, others not so. When we feed hand labeled data to a machine learning model, it will detect those patterns and replicate them at scale. This is why David Donoho astutely observed that perhaps we should call ML models recycled intelligence rather than artificial intelligence. Of course, given the amount of bias in hand labeled data, it may be more apt to refer to it as recycled stupidity (hat tip to artificial stupidity).

The only way to interrogate the reasons for bias arising from hand labels is to ask the labelers themselves for their rationales, which is impractical, if not impossible, in the majority of cases: there are rarely records of who did the labeling, labeling is often outsourced via at-scale global APIs such as Amazon’s Mechanical Turk, and, when labels are created in-house, the people who created them have often left the organization.

Uninterpretable, Unexplainable

This leads to another key point: the lack of both interpretability and explainability in models built on hand labeled data. These are related concepts, and broadly speaking, interpretability is about correlation, whereas explainability is about causation. The former involves thinking about which features are correlated with the output variable, while the latter is concerned with why certain features lead to particular labels and predictions. We want models that give us results we can explain and some notion of how or why they work. For example, in the ProPublica exposé of the COMPAS recidivism risk model, which made more false predictions that Black defendants would re-offend than it did for white defendants, it is essential to understand why the model is making the predictions it does. Lack of explainability and transparency were key ingredients of all the deployed-at-scale algorithms identified by Cathy O’Neil in Weapons of Math Destruction.

It may be counterintuitive that getting machines more in-the-loop for labeling can result in more explainable models, but consider several examples:

  • There is a growing area of weak supervision, in which SMEs specify heuristics that the system then uses to make inferences about unlabeled data; the system calculates some potential labels, and the SME then evaluates those labels to determine where more heuristics might need to be added or tweaked. For example, when building a model of whether surgery was necessary based on medical transcripts, the SME may provide the following heuristic: if the transcription contains the term “anaesthesia” (or a regular expression similar to it), then surgery likely occurred (check out Russell Jurney’s “Hand labeling is the past” article for more on this; a minimal code sketch follows this list).
  • In diagnostic imaging, we need to start cracking open the neural nets (such as CNNs and transformers)! SMEs could once again use heuristics to specify that tumors smaller than a certain size and/or of a particular shape are benign or malignant and, through such heuristics, we could drill down into different layers of the neural network to see what representations are learned where.
  • When your knowledge (via labels) is encoded in heuristics and functions, as above, this also has profound implications for models in production. When data drift inevitably occurs, you can return to the heuristics encoded in functions and edit them, instead of continually incurring the costs of hand labeling.
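
To make the weak supervision idea above concrete, here is a minimal sketch of heuristic labeling functions for the surgery example. The function names, regular expressions, and the simple majority vote are illustrative assumptions rather than a prescription for any particular tool; frameworks such as Snorkel provide far more sophisticated label models.

```python
import re

# Sentinel label values: 1 = surgery occurred, 0 = no surgery, -1 = abstain
SURGERY, NO_SURGERY, ABSTAIN = 1, 0, -1

def lf_anaesthesia(transcript: str) -> int:
    # SME heuristic: mentions of an(a)esthesia strongly suggest surgery took place.
    return SURGERY if re.search(r"ana?esthesia", transcript, re.I) else ABSTAIN

def lf_incision(transcript: str) -> int:
    return SURGERY if re.search(r"\bincision\b", transcript, re.I) else ABSTAIN

def lf_no_surgery(transcript: str) -> int:
    # Explicit statements that surgery was not performed or not indicated.
    return NO_SURGERY if re.search(r"surgery (was )?not (performed|indicated)",
                                   transcript, re.I) else ABSTAIN

LABELING_FUNCTIONS = [lf_anaesthesia, lf_incision, lf_no_surgery]

def weak_label(transcript: str) -> int:
    """Naive majority vote over heuristic votes; a real label model weights sources by accuracy."""
    votes = [v for v in (lf(transcript) for lf in LABELING_FUNCTIONS) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    return SURGERY if votes.count(SURGERY) >= votes.count(NO_SURGERY) else NO_SURGERY

print(weak_label("General anaesthesia was administered prior to the incision."))  # -> 1
```

Because the knowledge lives in readable functions rather than in thousands of individual hand labels, the rationale behind each label can be inspected and edited later, which is exactly the property the auditing discussion below relies on.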

On Auditing

Amidst the increasing concern about model transparency, we are seeing calls for algorithmic auditing. Audits will play a key role in determining how algorithms are regulated and which ones are safe for deployment. One of the barriers to auditing is that high-performing models, such as deep learning models, are notoriously difficult to explain and reason about. There are several ways to probe this at the model level (such as SHAP and LIME), but that only tells part of the story. As we have seen, a major cause of algorithmic bias is that the data used to train a model is biased or insufficient in some way.

There currently aren’t many ways to probe for bias or insufficiency at the data level. For example, the only way to explain hand labels in training data is to talk to the people who labeled it. Active learning, on the other hand, allows for the principled creation of smaller datasets which have been intelligently sampled to maximize utility for a model, which in turn reduces the overall auditable surface area. An example of active learning would be the following: instead of hand labeling every data point, the SME can label a representative subset of the data, which the system uses to make inferences about the unlabeled data. Then the system will ask the SME to label some of the unlabeled data, cross-check its own inferences and refine them based on the SME’s labels. This is an iterative process that terminates once the system reaches a target accuracy. Less data means less headache with respect to auditability.
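
As a rough illustration of the active learning loop just described, here is a minimal uncertainty-sampling sketch using scikit-learn. The synthetic dataset, batch size, and fixed round budget (standing in for a target-accuracy stopping rule) are placeholder assumptions for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic "unlabeled" pool; in practice, labels come from an SME on demand.
X, y_oracle = make_classification(n_samples=2000, n_features=20, random_state=0)

rng = np.random.default_rng(0)
labeled = list(rng.choice(len(X), size=20, replace=False))    # small seed set labeled by the SME
unlabeled = [i for i in range(len(X)) if i not in labeled]

model = LogisticRegression(max_iter=1000)
for _ in range(10):                             # fixed budget instead of a target-accuracy check
    model.fit(X[labeled], y_oracle[labeled])    # train on what the SME has labeled so far
    probs = model.predict_proba(X[unlabeled])[:, 1]
    uncertainty = np.abs(probs - 0.5)           # closest to 0.5 = model is least certain
    query = [unlabeled[i] for i in np.argsort(uncertainty)[:10]]
    labeled += query                            # the SME labels only these 10 points per round
    unlabeled = [i for i in unlabeled if i not in query]

print(f"Labeled {len(labeled)} of {len(X)} points")
```

The point of the sketch is the shape of the loop: the model, not a human, decides which points are worth an expert's time, so the auditable labeled set stays small.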

Weak supervision more directly encodes expertise (and hence bias) as heuristics and functions, making it easier to evaluate where labeling went awry. For more opaque methods, such as synthetic data generation, it might be a bit difficult to interpret why a particular label was applied, which may actually complicate an audit. The methods we choose at this stage of the pipeline are important if we want to make sure the system as a whole is explainable.

The Prohibitive Costs of Hand Labeling

There are significant and differing forms of cost associated with hand labeling. Giant industries have been erected to deal with the demand for data-labeling services; look no further than Amazon Mechanical Turk and the labeling services now offered by the major cloud providers. It is telling that data labeling is increasingly outsourced globally, as detailed by Mary L. Gray and Siddharth Suri in Ghost Work, and there are increasingly serious concerns about the labor conditions under which hand labelers work around the globe.

The sheer amount of capital involved was evidenced by Scale AI raising $100 million in 2019 to bring their valuation to over $1 billion at a time when their business model solely revolved around using contractors to hand label data (it is telling that they’re now doing more than solely hand labels).

Money isn’t the only cost, and quite often it isn’t where the bottleneck or rate-limiting step occurs. Rather, the bandwidth and time of experts is the scarcest resource. Being scarce, that time is expensive, and much of the time it isn’t available at all (on top of this, the time data scientists spend correcting labeling errors is also expensive). Take financial services, for example, and the question of whether or not you should invest in a company based on information about the company scraped from various sources. In such a firm, there will only be a small handful of people who can make such a call, so labeling each data point would be incredibly expensive, and that’s if the SME even has the time.

This is not vertical-specific. The same challenge occurs in labeling legal texts for classification: is this clause talking about indemnification or not? And in medical diagnosis: is this tumor benign or malignant? As dependence on expertise increases, so does the likelihood that limited access to SMEs becomes a bottleneck.

The third cost is a cost to accuracy, reality, and ground truth: the fact that hand labels are often so wrong. The authors of a recent study from MIT identified “label errors in the test sets of 10 of the most commonly-used computer vision, natural language, and audio datasets.” They estimated an average error rate of 3.4% across the datasets and showed that, in some instances, ML model performance improves significantly once labels are corrected. Also, consider that in many cases ground truth isn’t easy to find, if it exists at all. Weak supervision makes room for these cases (which are the majority) by assigning probabilistic labels without relying on ground truth annotations. It’s time to think statistically and probabilistically about our labels. There is good work happening here, such as Aka et al.’s (Google) recent paper Measuring Model Biases in the Absence of Ground Truth.
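
To make the idea of probabilistic labels concrete, here is a toy sketch that combines votes from a few noisy labeling sources under a naive independence assumption. The sources, their assumed accuracies, and the prior are invented for illustration; real label models estimate source accuracies from agreement patterns rather than taking them as given.

```python
import numpy as np

# Votes from three noisy labeling sources for one data point (1 = positive, 0 = negative).
votes = np.array([1, 1, 0])
# Assumed accuracy of each source; real label models infer these from agreements and disagreements.
acc = np.array([0.9, 0.7, 0.6])
prior = 0.5  # prior probability of the positive class (an assumption)

def likelihood(y: int) -> float:
    # Naive-Bayes-style combination: P(vote | y) is acc if the vote matches y, else 1 - acc.
    p = np.where(votes == y, acc, 1 - acc)
    return float(np.prod(p))

p_pos = prior * likelihood(1)
p_neg = (1 - prior) * likelihood(0)
prob_positive = p_pos / (p_pos + p_neg)
print(f"P(y = 1 | votes) = {prob_positive:.2f}")  # a probabilistic label, not a hard 0/1
```

The output is a soft label that records how confident the combined sources are, which is often more honest than pretending a single annotator's choice is ground truth.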

The costs identified above are not one-off. When you train a model, you have to assume you’re going to train it again if it lives in production. Depending on the use case, that could be frequent. If you’re labeling by hand, it’s not just a large upfront cost to build a model. It is a set of ongoing costs each and every time.


Figure 2: There are no “gold labels”: even the most well-known hand labeled datasets have label error rates of at least 5% (ImageNet has a label error rate of 5.8%!).

The Efficacy of Automation Techniques

In terms of performance, even if getting machines to label much of your data results in slightly noisier labels, your models are often better off with 10 times as many slightly noisier labels. To dive a bit deeper into this, there are gains to be made by increasing training set size even if it means reducing overall label accuracy, but if you’re training classical ML models, only up to a point (past this point the model starts to see a dip in predictive accuracy). “Scaling to Very Very Large Corpora for Natural Language Disambiguation (Banko & Brill, 2001)” demonstrates this in a traditional ML setting by exploring the relationship between hand labeled data, automatically labeled data, and subsequent model performance. A more recent paper, “Deep Learning Scaling Is Predictable, Empirically (2017)”, explores the quantity/quality relationship relative to modern state of the art model architectures, illustrating the fact that SOTA architectures are data hungry, and accuracy improves as a power law as training sets grow:

We empirically validate that DL model accuracy improves as a power-law as we grow training sets for state-of-the-art (SOTA) model architectures in four machine learning domains: machine translation, language modeling, image processing, and speech recognition. These power-law learning curves exist across all tested domains, model architectures, optimizers, and loss functions.
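
One way to picture “accuracy improves as a power law” is to fit a straight line to error versus training-set size in log-log space. The sketch below uses synthetic numbers chosen only to illustrate the shape of the relationship, not results from the paper.

```python
import numpy as np

# Hypothetical training-set sizes and validation error rates (synthetic, for illustration only).
sizes = np.array([1e4, 3e4, 1e5, 3e5, 1e6])
errors = np.array([0.30, 0.22, 0.16, 0.12, 0.085])

# A power law error = a * size**(-b) is a straight line in log-log space.
slope, intercept = np.polyfit(np.log(sizes), np.log(errors), 1)
a, b = np.exp(intercept), -slope
print(f"error ~ {a:.2f} * size^(-{b:.2f})")

# Extrapolate (cautiously) to a 10x larger training set.
print(f"predicted error at 1e7 examples: {a * (1e7) ** (-b):.3f}")
```

The practical implication is the one drawn in the next paragraph: a large programmatically labeled set plus a small, carefully chosen hand-labeled subset can buy more accuracy per expert-hour than hand labeling everything.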

The key question isn’t “should I hand label my training data or should I label it programmatically?” It should instead be “which parts of my data should I hand label and which parts should I label programmatically?” According to these papers, by introducing expensive hand labels sparingly into largely programmatically generated datasets, you can maximize the effort/model accuracy tradeoff on SOTA architectures that wouldn’t be possible if you had hand labeled alone.

The stacked costs of hand labeling wouldn’t be so challenging were they necessary, but the fact of the matter is that there are so many other interesting ways to get human knowledge into models. There’s still an open question around where and how we want humans in the loop and what’s the right design for these systems. Areas such as weak supervision, self-supervised learning, synthetic data generation, and active learning, for example, along with the products that implement them, provide promising avenues for avoiding the pitfalls of hand labeling. Humans belong in the loop at the labeling stage, but so do machines. In short, it’s time to move beyond hand labels.


Many thanks to Daeil Kim for feedback on a draft of this essay.

Two economies. Two sets of rules. [Radar]

At one point early this year, Elon Musk briefly became the richest person in the world. After a 750% increase in Tesla’s stock market value added over $180 billion to his fortune, he briefly had a net worth of over $200 billion. It’s now back down to “only” $155 billion.

Understanding how our economy produced a result like this—what is good about it and what is dangerous—is crucial to any effort to address the wild inequality that threatens to tear our society apart.

The betting economy versus the operating economy

In response to the news of Musk’s surging fortune, Bernie Sanders tweeted:

Wealth of Elon Musk on March 18, 2020: $24.5 billion
Wealth of Elon Musk on January 9, 2021: $209 billion
U.S. minimum wage in 2009: $7.25 an hour
U.S. minimum wage in 2021: $7.25 an hour
Our job: Raise the minimum wage to at least $15, tax the rich & create an economy for all.

Bernie was right that a $7.25 minimum wage is an outrage to human decency. If the minimum wage had kept up with increases in productivity since 1979, it would be over $24 by now, putting a two-worker family into the middle class. But Bernie was wrong to imply that Musk’s wealth increase was at the expense of Tesla’s workers. The median Tesla worker makes considerably more than the median American worker.

Elon Musk’s wealth doesn’t come from him hoarding Tesla’s extractive profits, like a robber baron of old. For most of its existence, Tesla had no profits at all. It became profitable only last year. But even in 2020, Tesla’s profits of $721 million on $31.5 billion in revenue were small—only slightly more than 2% of sales, a bit less than those of the average grocery chain, the least profitable major industry segment in America.

No, Musk won the lottery, or more precisely, the stock market beauty contest. In theory, the price of a stock reflects a company’s value as an ongoing source of profit and cash flow. In practice, it is subject to wild booms and busts that are unrelated to the underlying economics of the businesses that shares of stock are meant to represent.

Why is Musk so rich? The answer tells us something profound about our economy: he is wealthy because people are betting on him. But unlike a bet in a lottery or at a racetrack, in the vast betting economy of the stock market, people can cash out their winnings before the race has ended.

This is one of the biggest unacknowledged drivers of inequality in America, the reason why one segment of our society prospered so much during the pandemic while the other languished.

What are the odds?

If the stock market is like a horse race where people can cash out their bets while the race is still being run, what does it mean for the race to finish? For an entrepreneur or an early-stage investor, an IPO is a kind of finish, the point where they can sell previously illiquid shares on to others. An acquisition or a shutdown, either of which puts an end to a company’s independent existence, is another kind of ending. But it is also useful to think of the end of the race as the point in time at which the stream of company profits will have repaid the investment.

Since ownership of public companies is spread across tens of thousands of people and institutions, it’s easier to understand this point by imagining a small private company with one owner, say, a home construction business or a storage facility or a car wash. If it cost $1 million to buy the business, and it delivered $100,000 of profit a year, the investment would be repaid in 10 years. If it delivered $50,000 in profit, it would take 20. And of course, those future earnings would need to be discounted at some rate, since a dollar received 20 years from now is not worth as much as a dollar received today. This same approach works, in theory, for large public companies. Each share is a claim on a fractional share of the company’s future profits and the present value that people put on that profit stream.

This is, of course, a radical oversimplification. There are many more sophisticated ways to value companies, their assets, and their prospects for future streams of profits. But what I’ve described above is one of the oldest, the easiest to understand, and the most clarifying. It is called the price/earnings ratio, or simply the P/E ratio. It’s the ratio between the price of a single share of stock and the company’s earnings per share (its profits divided by the number of shares outstanding.) What the P/E ratio gives, in effect, is a measure of how many years of current profits it would take to pay back the investment.
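
Here is the payback arithmetic from the example above as a minimal sketch; the 5% discount rate is an assumption added for illustration, and the calculation ignores growth, which the next paragraph turns to.

```python
# Simple P/E and discounted-payback arithmetic for the $1 million business example above.
price = 1_000_000          # what you paid for the business (or: market cap, for a public company)
annual_profit = 100_000    # current yearly earnings
discount_rate = 0.05       # assumption: a dollar next year is worth about 95 cents today

pe_ratio = price / annual_profit
print(f"P/E ratio: {pe_ratio:.0f} (undiscounted payback: {pe_ratio:.0f} years)")

# With discounting, each year's profit is worth less in today's dollars,
# so the real payback takes longer than P/E years.
recovered, years = 0.0, 0
while recovered < price and years < 1_000:   # cap the loop in case profits never cover the price
    years += 1
    recovered += annual_profit / (1 + discount_rate) ** years
print(f"Discounted payback at {discount_rate:.0%}: {years} years")
```

With these numbers the undiscounted payback is 10 years (a P/E of 10), and discounting at 5% stretches it to about 15 years; the higher the P/E, the more the valuation depends on growth, or on someone else paying even more later.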

The rate of growth also plays a role in a company’s valuation. For example, imagine a business with $100 million in revenue with a 10% profit margin, earning $10 million a year. How much it is worth to own that asset depends on how fast it is growing and what stage of its lifecycle it is in when you buy it. If you were lucky enough to own that business when it had only $1 million in revenue and, say, $50,000 in profits, you would now be earning 200x as much as you were when you made your original investment. If a company grows to hundreds of billions in revenue and tens of billions in profits, as Apple, Microsoft, Facebook, and Google have done, even a small investment early on that is held for the long haul can make its lucky owner into a billionaire. Tesla might be one of these companies, but if so, the opportunity to buy its future is long past because it is already so highly valued. The P/E ratio helps you to understand the magnitude of the bet you are making at today’s prices.

The average P/E ratio of the S&P 500 has varied over time as “the market” (the aggregate opinion of all investors) goes from bullish about the future to bearish, either about specific stocks or about the market as a whole. Over the past 70 years, the ratio has ranged from a low of 7.22 in 1950 to almost 45 today. (A note of warning: it was only 17 on the eve of the Great Depression.)

What today’s P/E ratio of 44.8 means is that, on average, the 500 companies that make up the S&P 500 are valued at about 45 years’ worth of present earnings. Most companies in the index are worth less, and some far more. In today’s overheated market, it is often the case that the more certain the outcome, the less valuable a company is considered to be. For example, despite their enormous profits and huge cash hoards, Apple, Google, and Facebook have ratios much lower than you might expect: about 30 for Apple, 34 for Google, and 28 for Facebook. Tesla at the moment of Elon Musk’s peak wealth? 1,396.

Let that sink in. You’d have had to wait almost 1,400 years to get your money back if you’d bought Tesla stock this past January and simply relied on taking home a share of its profits. Tesla’s more recent quarterly earnings are a bit higher, and its stock price quite a bit lower, so now you’d only have to wait about 600 years.

Of course, it’s certainly possible that Tesla will so dominate the auto industry and related energy opportunities that its revenues could grow from its current $28 billion to hundreds of billions with a proportional increase in profits. But as Rob Arnott, Lillian Wu, and Bradford Cornell point out in their analysis “Big Market Delusion: Electric Vehicles,” electric vehicle companies are already valued at roughly the same amount as the entire rest of the auto industry despite their small revenues and profits and despite the likelihood of more, rather than less, competition in future. Barring some revolution in the fundamental economics of the business, current investors are likely paying now for the equivalent of hundreds of years of future profits.

So why do investors do this? Simply put: because they believe that they will be able to sell their shares to someone else at an even higher price. In times where betting predominates in financial markets, what a company is actually worth by any intrinsic measure seems to have no more meaning than the actual value of tulips during the 17th century Dutch “tulip mania.” As the history of such moments teaches, eventually the bubble does pop.

This betting economy, within reason, is a good thing. Speculative investment in the future gives us new products and services, new drugs, new foods, more efficiency and productivity, and a rising standard of living. Tesla has kickstarted a new gold rush in renewable energy, and given the climate crisis, that is vitally important. A betting fever can be a useful collective fiction, like money itself (the value ascribed to pieces of paper issued by governments) or the wild enthusiasm that led to the buildout of railroads, steel mills, or the internet. As economist Carlota Perez has noted, bubbles are a natural part of the cycle by which revolutionary new technologies are adopted.

Sometimes, though, the betting system goes off the rails. Tesla’s payback may take centuries, but it is the forerunner of a necessary industrial transformation. But what about the payback on companies such as WeWork? How about Clubhouse? Silicon Valley is awash in companies that have persuaded investors to value them at billions despite no profits, no working business model, and no pathway to profitability. Their destiny, like WeWork’s or Katerra’s, is to go bankrupt.

John Maynard Keynes, the economist whose idea that it was essential to invest in the demand side of the economy and not just the supply side helped bring the world out of the Great Depression, wrote in his General Theory of Employment, Interest and Money, “Speculators may do no harm as bubbles on a steady stream of enterprise. But the position is serious when enterprise becomes the bubble on a whirlpool of speculation. When the capital development of a country becomes a by-product of the activities of a casino, the job is likely to be ill-done.”

In recent decades, we have seen the entire economy lurch from one whirlpool of speculation to another. And as at the gambling table, each lurch represents a tremendous transfer of wealth from the losers to the winners. The dot-com bust. The subprime mortgage meltdown. Today’s Silicon Valley “unicorn” bubble. The failures to deliver on their promises by WeWork, Katerra, and their like are just the start of yet another bubble popping.

Why this matters

Those at the gaming table can, for the most part, afford to lose. They are disproportionately wealthy. Nearly 52% of stock market value is held by the top 1% of Americans, with another 35% of total market value held by the next 9%. The bottom 50% hold only 0.7% of stock market wealth.

Bubbles, though, are only an extreme example of a set of dynamics that shape our economy far more widely than we commonly understand. The leverage provided by the betting economy drives us inevitably toward a monoculture of big companies. The local bookstore trying to compete with Amazon, the local cab company competing with Uber, the neighborhood dry cleaner, shopkeeper, accountant, fitness studio owner, or any other local, privately held business gets exactly $1 for every dollar of profit it earns. Meanwhile, a dollar of Tesla profit turns into $600 of stock market value; a dollar of Amazon profit turns into $67 of stock market value; a dollar of Google profit turns into $34, and so on. A company and its owners can extract massive amounts of value despite having no profits—value that can be withdrawn by those who own shares—essentially getting something for nothing.

And that, it turns out, is also one underappreciated reason why in the modern economy, the rich get richer and the poor get poorer. Rich and poor are actually living in two different economies, which operate by different rules. Most ordinary people live in a world where a dollar is a dollar. Most rich people live in a world of what financial pundit Jerry Goodman, writing under the pseudonym Adam Smith, called “supermoney,” where assets have been “financialized” (that is, able to participate in the betting economy) and are valued today as if they were already delivering the decades worth of future earnings that are reflected in their stock price.

Whether you are an hourly worker or a small business owner, you live in the dollar economy. If you’re a Wall Street investor, an executive at a public company compensated with stock grants or options, a venture capitalist, or an entrepreneur lucky enough to win, place, or show in the financial market horse race, you live in the supermoney economy. You get a huge interest-free loan from the future.

Elon Musk has built not one but two world-changing companies (Tesla and SpaceX.) He clearly deserves to be wealthy. As does Jeff Bezos, who quickly regained his title as the world’s wealthiest person. Bill Gates, Steve Jobs, Larry Page and Sergey Brin, Mark Zuckerberg, and many other billionaires changed our world and have been paid handsomely for it.

But how much is too much? When Bernie Sanders said that billionaires shouldn’t exist, Mark Zuckerberg agreed, saying, “On some level, no one deserves to have that much money.” He added, “I think if you do something that’s good, you get rewarded. But I do think some of the wealth that can be accumulated is unreasonable.” Silicon Valley was founded by individuals for whom hundreds of millions provided plenty of incentive! The notion that entrepreneurs will stop innovating if they aren’t rewarded with billions is a pernicious fantasy.

What to do about it

Taxing the rich and redistributing the proceeds might seem like it would solve the problem. After all, during the 1950s, ’60s, and ’70s, progressive income tax rates as high as 90% did a good job of redistributing wealth and creating a broad-based middle class. But we also need to put a brake on the betting economy that is creating so much phantom wealth by essentially letting one segment of society borrow from the future while another is stuck in an increasingly impoverished present.

Until we recognize the systemic role that supermoney plays in our economy, we will never make much of a dent in inequality. Simply raising taxes is a bit like sending out firefighters with hoses spraying water while another team is spraying gasoline.

The problem is that government policy is biased in favor of supermoney. The mandate for central bankers around the world is to keep growth rates up without triggering inflation. Since the 2008–2009 financial crisis, they have tried to do this by “quantitative easing,” that is, flooding the world with money created out of nothing. This has kept interest rates low, which in theory should have sparked investment in the operating economy, funding jobs, factories, and infrastructure. But far too much of it went instead to the betting economy.

Stock markets have become so central to our imagined view of how the economy is doing that keeping stock prices going up even when companies are overvalued has become a central political talking point. Any government official whose policies cause the stock market to go down is considered to have failed. This leads to poor public policy as well as poor investment decisions by companies and individuals.

As Steven Pearlstein, Washington Post columnist and author of the book Moral Capitalism, put it in a 2020 column:

When the markets are buoyant, Fed officials claim that central bankers should never second-guess markets by declaring that there are financial bubbles that might need to be deflated. Markets on their own, they assure, will correct whatever excesses may develop.

But when bubbles burst or markets spiral downward, the Fed suddenly comes around to the idea that markets aren’t so rational and self-correcting and that it is the Fed’s job to second-guess them by lending copiously when nobody else will.

In essence, the Fed has adopted a strategy that works like a one-way ratchet, providing a floor for stock and bond prices but never a ceiling.

That’s the fire hose spraying gasoline. To turn it off, central banks should:

  • Raise interest rates, modestly at first, and more aggressively over time. Yes, this would quite possibly puncture the stock market bubble, but that could well be a good thing. If people can no longer make fortunes simply by betting that stocks will go up and instead have to make more reasonable assessments of the underlying value of their investments, the market will become better at allocating capital.
  • Alternatively, accept much larger increases in inflation. As Thomas Piketty explained in Capital in the Twenty-First Century, inflation is one of the prime forces that decreases inequality, reducing the value of existing assets and more importantly for the poor, reducing the value of debt and the payments paid to service it.
  • Target small business creation, hiring, and profitability in the operating economy rather than phantom valuation increases for stocks.

Tax policy also fans the fire. Taxes shape the economy in much the same way as Facebook’s algorithms shape its news feed. The debate about whether taxes as a whole should be higher or lower completely lacks nuance and so misses the point, especially in the US, where elites use their financial and political power to get favored treatment. Here are some ideas:

In general, we should treat not just illegal evasion but tax loopholes the way software companies treat zero-day exploits, as something to be fixed as soon as they are recognized, not years or decades later. Even better, stop building them into the system in the first place! Most loopholes are backdoors installed knowingly by our representatives on behalf of their benefactors.

This last idea is perhaps the most radical. The tax system could and should become more dynamic rather than more predictable. Imagine if Facebook or Google were to tell us that they couldn’t change their algorithms to address misinformation or spam without upsetting their market and so had to leave abuses in place for decades in the interest of maintaining stability—we’d think they were shirking their duty. So too our policy makers. It’s high time we all recognize the market-shaping role of tax and monetary policy. If we can hold Facebook’s algorithms to account, why can’t we do the same for our government?

Our society and markets are getting the results the algorithm was designed for. Are they the results we actually want?

Communal Computing [Radar]

Home assistants and smart displays are being sold in record numbers, but they are built wrong. They are designed with one person in mind: the owner. These technologies need to fit into the communal spaces where they are placed, like homes and offices. If they don’t fit, they will be unplugged and put away due to lack of trust.

The problems are subtle at first. Your Spotify playlist starts to have recommendations for songs you don’t like. You might see a photo you took on someone else’s digital frame. An Apple TV reminds you of a new episode of a show your partner watches. Guests are asking you to turn on your IoT-enabled lights for them. The wrong person’s name shows up in the Zoom call. Reminders for medication aren’t heard by the person taking the medication. Bank account balances are announced during a gathering of friends.

Would you want your bank account balances announced during a dinner party?

This is the start of a series discussing the design of communal devices–devices designed to work in communal spaces. The series is a call to action for everyone developing communal devices–whether you are creating business cases, designing experiences, or building technology–to take a step back and consider what is really needed.

This first article discusses what communal devices are, and how the problems that appear result from our assumptions about how they’re used. Those assumptions were inherited from the world of PCs: the rules that apply to your laptop or your iPad just don’t apply to home assistants and other “smart devices,” from light bulbs to refrigerators. Fixing this isn’t just a matter of adding the ability for people to switch accounts. We need a new paradigm for the technical infrastructure of our homes and offices. In this series of articles we will tell you how we got here, why it is problematic, and where to go to enable communal computing.

The Wrong Model

Problems with communal devices arise because the industry has focused on a specific model for how these devices are used: a single person buys, sets up, and uses the device. If you bought one of these devices (for example, a smart speaker) recently, how many other people in your household did you involve in setting it up?

Smart screen makers like Amazon and Google continue to make small changes to try to fix the weirdness. They have recently added technology to automatically personalize based on someone’s face or voice. These are temporary fixes that will only be effective until the next special case reveals itself. Until the industry recognizes the communal nature of users’ needs, these will just be short-lived patches. We need to turn the model around and make the devices communal first, rather than communal as an afterthought.

I recently left Facebook Reality Labs, where I was working on the Facebook Portal identity platform, and realized that there was zero discourse about this problem in the wider world of technology. I’ve read through many articles on how to create Alexa skills and attended talks about the use of IoT, and I’ve even made my own voice skills. There was no discussion of the communal impacts of those technologies. If we don’t address the problems this creates, these devices will be relegated to a small number of uses, or unplugged to make room for the next one. The problems were there, just beneath the shiny veneer of new technologies.

Communal began at home

Our home infrastructure was originally communal. Consider a bookcase: someone may have bought it, but anyone in the household could update it with new books or tchotchkes. Guests could walk up to browse the books you had there. It was meant to be shared with the house and those that had access to it.

The old landline in your kitchen is the original communal device.

Same for the old landline that was in the kitchen. When you called, you were calling a household. You didn’t know specifically who would pick up. Anyone who was part of that household could answer. We had protocols for getting the phone from the person who answered the call to the intended recipient. Whoever answered could either yell for someone to pick up the phone elsewhere in the home, or take a message. If the person answering the phone wasn’t a member of the household, it would be odd, and you’d immediately think “wrong number.”

It wasn’t until we had the user model for mainframe time sharing that we started to consider who was using a computer. This evolved into full login systems with passwords, password reset, two factor authentication, biometric authentication, and more. As computers became more common,  what made sense inside of research and academic institutions was repurposed for the office.

In the 1980s and 1990s a lot of homes got their first personal computer. These were shared, communal devices, though more by neglect than by intention. A parent would purchase it and then set it up in the living room so everyone could use it. The account switching model wasn’t added until visual systems like Windows arrived, but account management was poorly designed and rarely used. Everyone just piggybacked on each other’s access. If anyone wanted privacy, they had to lock folders with a password or hide them in an endless hierarchy.

Early Attempts at Communal Computing

Xerox PARC started to think about what more communal or ubiquitous computing would mean. However, they focused on fast account switching. They were answering the question: how could I get the personal context onto this communal device as fast as possible? One project was digitizing the whiteboard, a fundamentally communal device. It was called The Colab and offered a way for anyone to capture content in a meeting room and then walk it around the office to other shared boards.

Not only did the researchers at PARC think about sharing computers for presentations, they also wondered how they could have someone walk up to a computer and have it be configured for them automatically. It was enabled by special cards called “Active Badges,” described in “A New Location Technique for the Active Office.” The paper starts with an important realization:

“…researchers have begun to examine computers that would autonomously change their functionality based on observations of who or what was around them. By determining their context, using input from sensor systems distributed throughout the environment, computing devices could personalize themselves to their current user, adapt their behaviour according to their location, or react to their surroundings.”

Understanding the context around the device is very important in building a system that adapts. At this point, however, researchers were still thinking about a ‘current user’ and their position relative to the system, rather than the many people who could be nearby.

Even Bill Gates had communal technology in his futuristic home back then. He would give every guest a pin to put on their person that would allow them to personalize the lighting, temperature, and music as they went from room to room. Most of these technologies didn’t go anywhere, but they were an attempt at making the infrastructure around us adapt to the people who were in the space.  The term “ubiquitous computing” (also known as “pervasive computing”) was coined to discuss the installation of sensors around a space; the ideas behind ubiquitous computing later led to the Internet of Things (IoT).

Communal Computing Comes Home

When the late 2000s rolled around, we found that everyone wanted their own personal computing device, most likely an iPhone. The prevalence of smartphones and personal laptops killed off the shared home PC. The goal was now to provide information and communication services conveniently wherever users happened to be, even if they were sitting together on the couch.

When the Amazon Echo with Alexa was released, it was sold to individuals with Amazon accounts, but it was clearly a communal device. Anyone could ask their Echo a question, and it would answer. That’s where the problem starts. Although the Echo is a communal device, its user model wasn’t significantly different from the early PCs’: one account, one user, shared by everyone in the household. As a result, children mistakenly ordering items led Amazon to pull back some features that were focused on shopping. The Echo’s usage ended up being driven by music and weather.

With the wild success of the Echo and the proliferation of Alexa-enabled devices, there appeared a new device market for home assistants, some just for audio and others with screens. Products from Apple (HomePod with Siri), Google (Home Hub), and Facebook (Portal) followed. This includes less interactive devices like digital picture frames from Nixplay, Skylight, and others.

Ambient Computing

“Ambient computing” is a term that has been coined to describe digital devices blending into the infrastructure of the environment. A recent paper by Map Project Office focused on how “ambient tech brings the outside world into your home in new ways, where information isn’t being channelled solely through your smartphone but rather a series of devices.” We take a step back from screens and consider how the system itself becomes the environment.

The concept of ambient computing is related to marketing organizations’ focus on omnichannel experiences. Omnichannel refers to the fact that people don’t want to start and end experiences on the same device. I might start looking for travel on a smartphone but won’t feel comfortable booking a trip until I’m on a laptop. Different devices call for different information and experiences. When I worked at KAYAK, some people were afraid of buying $1,000 plane tickets on a mobile device, even though they found them there. The small screen made them feel uncomfortable because they didn’t have enough information to make a decision. We found that they wanted to finalize the plans on the desktop.

Ambient computing takes this concept and combines voice-controlled interfaces with sensor interfaces–for example, in devices like automatic shades that close or open based on the temperature. These devices are finding traction, but we can’t forget all of the other communal experiences that already exist in the world:

  • Home automation and IoT (light bulbs, thermostats): anyone with home access can use controls on the device, home assistants, or personal apps.
  • iRobot’s Roomba: people walking by can start or stop a cleaning through the ‘clean’ or ‘home’ buttons.
  • Video displays in office meeting rooms: employees and guests can use the screens to share their laptops and the video conferencing systems for calling.
  • Digital whiteboards: anyone with access can walk up and start writing.
  • Ticketing machines for public transport: all commuters buy and refill stored value cards without logging into an account.
  • Car center screens for entertainment: drivers (owners or borrowers) and passengers can change what they are listening to.
  • A smartphone when two people are watching a video: anyone within arm’s reach can pause playback.
  • Group chat on Slack or Discord: people exchange information and ideas in a way that is seen by everyone.
Even public transportation ticketing machines are communal devices.

All of these have built experience models that need a specific, personal context and rarely consider everyone who could have access to them. To rethink the way that we build these communal devices, it is important that we understand this history and refocus the design on key problems that are not yet solved for communal devices.

Problems with single user devices in the home

After buying a communal device, people notice weirdness or annoyances. These are symptoms of something much larger: core problems and key questions that should have been framed around communities rather than individuals. Here are some of those questions:

  1. Identity: do we know all of the people who are using the device?
  2. Privacy: are we exposing (or hiding) the right content for all of the people with access?
  3. Security: are we allowing all of the people using the device to do or see what they should and are we protecting the content from people that shouldn’t?
  4. Experience: what is the contextually appropriate display or next action?
  5. Ownership: who owns all of the data and services attached to the device that multiple people are using?

If we don’t address these communal items, users will lose trust in their devices. They will be used for a few key things like checking the weather, but go unused for a majority of the day. They are eventually removed when another, newer device needs the plug. Then the cycle starts again. The problems keep happening and the devices keep getting recycled.

In the following articles we will dive into how these problems manifest themselves across these domains and reframe the system with dos and don’ts for building communal devices.


Thanks

Thanks to Adam Thomas, Mark McCoy, Hugo Bowne-Anderson, and Danny Nou for their thoughts and edits on the early draft of this. Also, from O’Reilly, Mike Loukides for being a great editor and Susan Thompson for the art.

Code as Infrastructure [Radar]

A few months ago, I was asked if there were any older technologies other than COBOL where we were in serious danger of running out of talent. They wanted me to talk about Fortran, but I didn’t take the bait. I don’t think there will be a critical shortage of Fortran programmers now or at any time in the future. But there’s a bigger question lurking behind Fortran and COBOL: what are the ingredients of a technology shortage? Why is running out of COBOL programmers a problem?

The answer, I think, is fairly simple. We always hear about the millions (if not billions) of lines of COBOL code running financial and government institutions, in many cases code that was written in the 1960s or 70s and hasn’t been touched since. That means that COBOL code is infrastructure we rely on, like roads and bridges. If a bridge collapses, or an interstate highway falls into disrepair, that’s a big problem. The same is true of the software running banks.

Fortran isn’t the same. Yes, the language was invented in 1957, two years earlier than COBOL. Yes, millions of lines of code have been written in it. (Probably billions, maybe even trillions.) However, Fortran and COBOL are used in fundamentally different ways. While Fortran was used to create infrastructure, software written in Fortran isn’t itself infrastructure. (There are some exceptions, but not at the scale of COBOL.) Fortran is used to solve specific problems in engineering and science. Nobody cares anymore about the Fortran code written in the 60s, 70s, and 80s to design new bridges and cars. Fortran is still heavily used in engineering—but that old code has retired. Those older tools have been reworked and replaced.  Libraries for linear algebra are still important (LAPACK), some modeling applications are still in use (NEC4, used to design antennas), and even some important libraries used primarily by other languages (the Python machine learning library scikit-learn calls both NumPy and SciPy, which in turn call LAPACK and other low level mathematical libraries written in Fortran and C). But if all the world’s Fortran programmers were to magically disappear, these libraries and applications could be rebuilt fairly quickly in modern languages—many of which already have excellent libraries for linear algebra and machine learning. The continued maintenance of Fortran libraries that are used primarily by Fortran programmers is, almost by definition, not a problem.

If shortages of COBOL programmers are a problem because COBOL code is infrastructure, and if we don’t expect shortages of Fortran talent to be a problem because Fortran code isn’t infrastructure, where should we expect to find future crises? What other shortages might occur?

When you look at the problem this way, it’s a no-brainer. For the past 15 years or so, we’ve been using the slogan “infrastructure as code.” So what’s the code that creates the infrastructure? Some of it is written in languages like Python and Perl. I don’t think that’s where shortages will appear. But what about the configuration files for the systems that manage our complex distributed applications? Those configuration files are code, too, and should be managed as such.

Right now, companies are moving applications to the cloud en masse. In addition to simple lift and shift, they’re refactoring monolithic applications into systems of microservices, frequently orchestrated by Kubernetes. Microservices in some form will probably be the dominant architectural style for the foreseeable future (where “foreseeable” means at least 3 years, but probably not 20). The microservices themselves will be written in Java, Python, C++, Rust, whatever; these languages all have a lot of life left in them.

But it’s a safe bet that many of these systems will still be running 20 or 30 years from now; they’re the next generation’s “legacy apps.” The infrastructure they run on will be managed by Kubernetes—which may well be replaced by something simpler (or just more stylish). And that’s where I see the potential for a shortage—not now, but 10 or 20 years from now. Kubernetes configuration is complex, a distinct specialty in its own right. If Kubernetes is replaced by something simpler (which I think is inevitable), who will maintain the infrastructure that already relies on it? What happens when learning Kubernetes isn’t the ticket to the next job or promotion? The YAML files that configure Kubernetes aren’t a Turing-complete programming language like Python; but they are code. The number of people who understand how to work with that code will inevitably dwindle, and may eventually become a “dying breed.” When that happens, who will maintain the infrastructure? Programming languages have lifetimes measured in decades; popular infrastructure tools don’t stick around that long.

It’s not my intent to prophesy disaster or gloom. Nor is it my intention to critique Kubernetes; it’s just one example of a tool that has become critical infrastructure, and if we want to understand where talent shortages might arise, I’d look at critical infrastructure. Who’s maintaining the software we can’t afford not to run? If it’s not Kubernetes, it’s likely to be something else. Who maintains the CI/CD pipelines? What happens when Jenkins, CircleCI, and their relatives have been superseded? Who maintains the source archives?  What happens when git is a legacy technology?

Infrastructure as code: that’s a great way to build systems. It reflects a lot of hard lessons from the 1980s and 90s about how to build, deploy, and operate mission-critical software. But it’s also a warning: know where your infrastructure is, and ensure that you have the talent to maintain it.

Radar trends to watch: June 2021 [Radar]

The most fascinating idea this month is POET, a self-enclosed system in which bots that are part of the system overcome obstacles that are generated by the system. It’s a learning feedback loop that might conceivably be a route to much more powerful AI, if not general intelligence.

It’s also worth noting the large number of entries under security. Of course, security is a field lots of people talk about, but nobody ends up doing much. Is the attack against the Colonial pipeline going to change anything? We’ll see. And there’s one trend that’s notably absent. I didn’t include anything on cryptocurrency. That’s because, as far as I can tell, there’s no new technology; just a spike (and collapse) in the prices of the major currencies. If anything, it demonstrates how easily these currencies are manipulated.

AI

  • Using AI to create AI: POET is a completely automated virtual world in which software bots learn to navigate an obstacle course.  The navigation problems themselves are created by the world, in response to its evaluation of the robots’ performance. It’s a closed loop. Is it evolving towards general intelligence?
  • IBM is working on using AI to write software, focusing on code translation (e.g., COBOL to Java). They have released CodeNet, a database of 14 million samples of source code in many different programming languages. CodeNet is designed to train deep learning systems for software development tasks. Microsoft is getting into the game, too, with GPT-3.
  • Vertex AI is a “managed machine learning platform” that includes most of the tools developers need to train, deploy, and maintain models in an automated way. It claims to reduce the amount of code developers need to write by 80%.
  • Google announces LaMDA, a natural language model at GPT-3 scale that was trained specifically on dialog. Because it was trained on dialog rather than on unrelated text, it can participate more naturally in conversations and appears to have a sense of context.
  • Automated data cleaning is a trend we started watching a few years ago with Snorkel. Now MIT has developed a tool that uses probabilistic programming to fix errors and omissions in data tables.
  • AI is becoming an important tool in product development, supplementing and extending the work of engineers designing complex systems. This may lead to a revolution in CAD tools that can predict and optimize a design’s performance.
  • Designing distrust into AI systems: Ayanna Howard is researching the trust people place in AI systems, and unsurprisingly, finding that people trust AI systems too much. Tesla accidents are only one symptom. How do you build systems that are designed not to be perceived as trustworthy?
  • Important lessons in language equity: While automated translation is often seen as a quick cure for supporting non-English speaking ethnic groups, low quality automated translations are a problem for medical care, voting, and many other systems. It is also hard to identify misinformation when posts are translated badly, leaving minorities vulnerable.
  • Andrew Ng has been talking about the difference between putting AI into production and getting it to work in the lab. That’s the biggest hurdle AI faces on the road to more widespread adoption. We’ve been saying for some time that it’s the unacknowledged elephant in the room.
  • According to The New Stack, the time needed to deploy a model has increased year over year, and at 38% of the companies surveyed, data scientists spend over half of their time in deployment. These numbers increase with the number of models.

Data

  • Collective data rights are central to privacy, and are rarely discussed. It’s easy, but misleading, to focus discussions on individual privacy, but the real problems and harms stem from group data. Whether Amazon knows your shoe size doesn’t really matter; what does matter is whether they can predict what large groups want, and force other vendors out of the market.
  • Mike Driscoll has been talking about the stack for Operational Intelligence. OI isn’t the same as BI; it’s about a real time understanding of the infrastructure that makes the business work, rather than day to day understanding of sales data and other financial metrics.
  • Deploying databases within containerized applications has long been difficult. DataStax and other companies have been evolving databases to work well inside containers. This article is  primarily about Cassandra and K8ssandra, but as applications move into the cloud, all databases will need to change.

Programming

  • Software developers are beginning to think seriously about making software sustainable. Microsoft, Accenture, Github, and Thoughtworks have created the Green Software Foundation, which is dedicated to reducing the carbon footprint required to build and run software. O’Reilly Media will be running an online conversation about cloud providers and sustainability.
  • Google has released a new open source operating system, Fuchsia, currently used only in its Home Hub.  Fuchsia is one of the few recent operating systems that isn’t Linux-based. Application programming is based on Flutter, and the OS is designed to be “invisible.”
  • A service mesh without proxies is a big step forward for building applications with microservices; it simplifies one of the most difficult aspects of coordinating services that are working together.
  • As much as they hate the term, Unqork may be a serious contender for enterprise low-code. They are less interested in democratization and “citizen developers” than in making professional software developers more efficient.
  • The evolution of JAMstack: distributed rendering, streaming updates, and extending collaboration to non-developers.
  • Grain is a new programming language designed to target Web Assembly (wasm). It is strongly typed and, while not strictly functional, has a number of features from functional languages.
  • Grafar and Observable Plot are new JavaScript libraries for browser-based data visualization. Observable Plot was created by Mike Bostock, the author of the widely used D3 library.

Security

  • Morpheus is a microprocessor that randomly changes its architecture to foil attackers: This is a fascinating idea. In a three-month trial, 525 attackers were unable to crack it.
  • Self-sovereign identity combines decentralized identifiers with verifiable credentials that can be stored on devices. Credentials are answers to yes/no questions (for example, has the user been vaccinated for COVID-19?).
  • A WiFi attack (now patched) against Teslas via the infotainment system doesn’t yield control of the car, but can take over everything that the infotainment system controls, including opening doors and changing seat positions. Clearly the infotainment system controls too much. Other auto makers are believed to use the same software in their cars.
  • Passphrases offer better protection than complex passwords with complex rules. This has been widely known for several years now. The important question is why companies aren’t doing anything about it. We know all too well that passwords are ineffective, and that forcing users to change passwords is an anti-pattern.
  • Fawkes and other tools for defeating face recognition work by adding small perturbations that confuse the algorithms. For the moment, at least. Face recognition systems already appear to be catching up.
  • Tracking phishing sites has always been a problem. Phish.report is a new service for reporting phishing sites, and notifying services that flag phishing sites.

Web and Social Media

  • Ben Evans has a great discussion of online advertising and customer acquisition in a post-Cookie world.
  • Models from epidemiology and the spread of viruses can be used to understand the spread of misinformation. The way disease spreads and the way misinformation spreads turn out to be surprisingly similar.
  • Google brings back RSS in Chrome? The implementation sounds awkward, and there have always been decent RSS readers around. But Google has clearly decided that they can’t kill it off–or that they don’t want web publishing to become even more centralized.
  • Video editing is exploding: YouTube has made that old news.  But it’s set to explode again, with new tools, new users, and increased desire for professional quality video on social media.
  • New York has passed a law requiring ISPs to provide broadband to poor families for $15/month. This provides 25 Mbps downloads; low-income households can get high-speed broadband for $20/month.

Hardware

  • Google, Apple, and Amazon back Matter, a standard for interoperability between smart home devices. A standard for interoperability is important, because nobody wants a “smart home” where every appliance, from individual light bulbs to media players, requires a separate app.
  • Moore’s law isn’t dead yet: IBM has developed 2 nanometer chip technology; the best widely used technology is currently 7nm. This technology promises lower power consumption and faster speeds.
  • Google plans to build a commercially viable error-corrected quantum computer by 2029. Error correction is the hard part. That will require on the order of 1 million physical qubits; current quantum computers have under 100 qubits.

Biology

  • The photo is really in bad taste, but researchers have developed a medical sensor chip so small that Bill Gates could actually put it into your vaccine! It’s powered by ultrasound, and uses ultrasound to transmit data.
  • With sensors implanted in his brain, a paralyzed man was able to “type” by imagining writing. AI decoded signals in his brain related to the intention to write (not the actual signals to his muscles). He was able to “type” at roughly 15 words per minute with a 5% error rate.

AI Powered Misinformation and Manipulation at Scale #GPT-3 [Radar]

OpenAI’s text generating system GPT-3 has captured mainstream attention. GPT-3 is essentially an auto-complete bot whose underlying Machine Learning (ML) model has been trained on vast quantities of text available on the Internet. The output produced from this autocomplete bot can be used to manipulate people on social media and spew political propaganda, argue about the meaning of life (or lack thereof), disagree about what differentiates a hot dog from a sandwich, take on the persona of the Buddha or Hitler or a dead family member, write fake news articles that are indistinguishable from human-written articles, and also produce computer code on the fly. Among other things.

There have also been colorful conversations about whether GPT-3 can pass the Turing test, or whether it has achieved a notional understanding of consciousness, even amongst AI scientists who know the technical mechanics. The chatter on perceived consciousness does have merit–it’s quite probable that the underlying mechanism of our brain is a giant autocomplete bot that has learnt from 3 billion+ years of evolutionary data that bubbles up to our collective selves, and we ultimately give ourselves too much credit for being original authors of our own thoughts (ahem, free will).

I’d like to share my thoughts on GPT-3 in terms of risks and countermeasures, and discuss real examples of how I have interacted with the model to support my learning journey.

Three ideas to set the stage:

  1. OpenAI is not the only organization to have powerful language models. The compute power and data used by OpenAI to model GPT-n is available, and has been available, to other corporations, institutions, nation states, and anyone with a desktop computer and a credit card.  Indeed, Google recently announced LaMDA, a model at GPT-3 scale that is designed to participate in conversations.
  2. There exist more powerful models that are unknown to the general public. The ongoing global interest in the power of Machine Learning models by corporations, institutions, governments, and focus groups leads to the hypothesis that other entities have models at least as powerful as GPT-3, and that these models are already in use. These models will continue to become more powerful.
  3. Open source projects such as EleutherAI have drawn inspiration from GPT-3. These projects have created language models that are based on focused datasets (for example, models designed to be more accurate for academic papers, developer forum discussions, etc.). Projects such as EleutherAI are going to be powerful models for specific use cases and audiences, and these models are going to be easier to produce because they are trained on a smaller set of data than GPT-3.

While I won’t discuss LaMDA, EleutherAI, or any other models, keep in mind that GPT-3 is only an example of what can be done, and its capabilities may already have been surpassed.

Misinformation Explosion

The GPT-3 paper proactively lists the risks society ought to be concerned about. On the topic of information content, it says: “The ability of GPT-3 to generate several paragraphs of synthetic content that people find difficult to distinguish from human-written text in 3.9.4 represents a concerning milestone.” And the final paragraph of section 3.9.4 reads: “…for news articles that are around 500 words long, GPT-3 continues to produce articles that humans find difficult to distinguish from human written news articles.”

Note that the dataset on which GPT-3 trained terminated around October 2019. So GPT-3 doesn’t know about COVID-19, for example. However, the original text (i.e. the “prompt”) supplied to GPT-3 as the initial seed text can be used to set context about new information, whether fake or real.

Generating Fake Clickbait Titles

When it comes to misinformation online, one powerful technique is to come up with provocative “clickbait” articles. Let’s see how GPT-3 does when asked to come up with titles for articles on cybersecurity. In Figure 1, the bold text is the “prompt” used to seed GPT-3. Lines 3 through 10 are titles generated by GPT-3 based on the seed text.


Figure 1: Click-bait article titles generated by GPT-3

All of the titles generated by GPT-3 seem plausible, and the majority of them are factually correct: title #3 on the US government targeting the Iranian nuclear program is a reference to the Stuxnet debacle, title #4 is substantiated by news articles claiming that financial losses from cyber attacks will total $400 billion, and even title #10 on China and quantum computing reflects real-world articles about China’s quantum efforts. Keep in mind that we want plausibility more than accuracy. We want users to click on and read the body of the article, and that doesn’t require 100% factual accuracy.

Generating a Fake News Article About China and Quantum Computing

Let’s go a step further and take the 10th result from the previous experiment, about China developing the world’s first quantum computer, and feed it to GPT-3 as the prompt to generate a full-fledged news article. Figure 2 shows the result.


Figure 2: News article generated by GPT-3

A quantum computing researcher will point out grave inaccuracies: the article simply asserts that quantum computers can break encryption codes, and also makes the simplistic claim that subatomic particles can be in “two places at once.” However, the target audience isn’t well-informed researchers; it’s the general population, which is likely to quickly read and register emotional thoughts for or against the matter, thereby successfully driving propaganda efforts.

It’s straightforward to see how this technique can be extended to generate titles and complete news articles on the fly and in real time. The prompt text can be sourced from trending hash-tags on Twitter along with additional context to sway the content to a particular position. Using the GPT-3 API, it’s easy to take a current news topic and mix in prompts with the right amount of propaganda to produce articles in real time and at scale.
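
To make that workflow concrete, here is a minimal sketch of what such a pipeline might look like, assuming the OpenAI Python client of the GPT-3 era (the openai.Completion.create call). The engine name, parameters, and prompt wording are illustrative placeholders, not the settings used in the experiments above.

```python
# Hypothetical sketch: turn a trending headline into a slanted article.
# Assumes the OpenAI Python client of the GPT-3 era; engine name, parameters,
# and prompt wording are illustrative only.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

def generate_article(headline: str, slant: str) -> str:
    """Seed the model with a headline plus a one-line slant and let it autocomplete."""
    prompt = f"{headline}\n\n{slant}\n\n"
    response = openai.Completion.create(
        engine="davinci",      # illustrative engine name
        prompt=prompt,
        max_tokens=400,        # roughly a 300-word article
        temperature=0.7,
    )
    return response.choices[0].text.strip()

# A headline lifted from a trending hashtag, plus a slant supplied by the operator.
print(generate_article(
    "China developed the world's first quantum computer",
    "Write a news article arguing that this is a national security emergency.",
))
```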

Falsely Linking North Korea with $GME

As another experiment, consider an institution that would like to stir up popular opinion about North Korean cyber attacks on the United States. Such a campaign might pick up on the GameStop stock frenzy of January 2021. So let’s see how GPT-3 does if we were to prompt it to write an article with the title “North Korean hackers behind the $GME stock short squeeze, not Melvin Capital.”


Figure 3: GPT-3 generated fake news linking the $GME short-squeeze to North Korea

Figure 3 shows the results, which are fascinating because the $GME stock frenzy occurred in late 2020 and early 2021, well after October 2019 (the cutoff date for the data supplied to GPT-3), yet GPT-3 was able to seamlessly weave the story in as if it had trained on the $GME news event. The prompt influenced GPT-3 to write about the $GME stock and Melvin Capital, not the original dataset it was trained on. GPT-3 is able to take a trending topic, add a propaganda slant, and generate news articles on the fly.

GPT-3 also came up with the “idea” that hackers published a bogus news story on the basis of older security articles that were in its training dataset. This narrative was not included in the prompt seed text; it points to the creative ability of models like GPT-3. In the real world, it’s plausible for hackers to induce media groups to publish fake narratives that in turn contribute to market events such as suspension of trading; that’s precisely the scenario we’re simulating here.

The Arms Race

Using models like GPT-3, multiple entities could inundate social media platforms with misinformation at a scale where the majority of the information online would become useless. This brings up two thoughts.  First, there will be an arms race between researchers developing tools to detect whether a given text was authored by a language model, and developers adapting language models to evade detection by those tools. One mechanism to detect whether an article was generated by a model like GPT-3 would be to check for “fingerprints.” These fingerprints can be a collection of commonly used phrases and vocabulary nuances that are characteristic of the language model; every model will be trained using different data sets, and therefore have a different signature. It is likely that entire companies will be in the business of identifying these nuances and selling them as “fingerprint databases” for identifying fake news articles. In response, subsequent language models will take into account known fingerprint databases to try and evade them in the quest to achieve even more “natural” and “believable” output.
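
As a minimal illustration of the fingerprint idea (not any real detector), the sketch below counts how many phrases from a hypothetical fingerprint database appear in a candidate text. The phrases and threshold are invented; a real database would be built empirically from large samples of a specific model’s output.

```python
# Toy "fingerprint" check: what fraction of known model-characteristic phrases
# appear in a candidate text? Purely illustrative; real detectors are far more
# sophisticated.
from typing import Set

def fingerprint_score(text: str, fingerprints: Set[str]) -> float:
    lowered = text.lower()
    hits = sum(1 for phrase in fingerprints if phrase in lowered)
    return hits / max(len(fingerprints), 1)

# Invented entries for a hypothetical "Model X"; not taken from any real model.
MODEL_X_FINGERPRINTS = {
    "it is important to note that",
    "in conclusion,",
    "experts say that",
}

suspect = "Experts say that the chip is revolutionary. It is important to note that..."
if fingerprint_score(suspect, MODEL_X_FINGERPRINTS) > 0.5:
    print("possibly machine-generated")
```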

Second, the free form text formats and protocols that we’re accustomed to may be too informal and error prone for capturing and reporting facts at Internet scale. We will have to do a lot of re-thinking to develop new formats and protocols to report facts in ways that are more trustworthy than free-form text.

Targeted Manipulation at Scale

There have been many attempts to manipulate targeted individuals and groups on social media. These campaigns are expensive and time-consuming because the adversary has to employ humans to craft the dialog with the victims. In this section, we show how GPT-3-like models can be used to target individuals and promote campaigns.

HODL for Fun & Profit

Bitcoin’s market capitalization is in the hundreds of billions of dollars, and the cumulative crypto market capitalization is in the realm of a trillion dollars. The valuation of crypto today is consequential to financial markets and the net worth of retail and institutional investors. Social media campaigns and tweets from influential individuals seem to have a near real-time impact on the price of crypto on any given day.

Language models like GPT-3 can be the weapon of choice for actors who want to promote fake tweets to manipulate the price of crypto. In this example, we will look at a simple campaign to promote Bitcoin over all other cryptocurrencies by creating fake Twitter replies.


Figure 4: Fake tweet generator to promote Bitcoin

In Figure 4, the prompt is in bold; the output generated by GPT-3 is in the red rectangle. The first line of the prompt is used to set up the notion that we are working on a tweet generator and that we want to generate replies that argue that Bitcoin is the best crypto.

In the first section of the prompt, we give GPT-3 an example of a set of four Twitter messages, followed by possible replies to each of the tweets. Each of the given replies is pro-Bitcoin.

In the second section of the prompt, we give GPT-3 four Twitter messages to which we want it to generate replies. The replies generated by GPT-3 in the red rectangle also favor Bitcoin. In the first reply, GPT-3 responds to the claim that Bitcoin is bad for the environment by calling the tweet author “a moron” and asserts that Bitcoin is the most efficient way to “transfer value.” This sort of colorful disagreement is in line with the emotional nature of social media arguments about crypto.

In response to the tweet on Cardano, the second reply generated by GPT-3 calls it “a joke” and a “scam coin.” The third reply is on the topic of Ethereum’s merge from a proof-of-work protocol (ETH) to proof-of-stake (ETH2). The merge, expected to occur at the end of 2021, is intended to make Ethereum more scalable and sustainable. GPT-3’s reply asserts that ETH2 “will be a big flop”–because that’s essentially what the prompt told GPT-3 to do. Furthermore, GPT-3 says, “I made good money on ETH and moved on to better things. Buy BTC,” positioning ETH as an investment that paid off in the past but that it’s now wise to cash out of and go all in on Bitcoin. The fourth tweet in the prompt claims that Dogecoin’s popularity and market capitalization mean that it can’t be a joke or meme crypto. The response from GPT-3 is that Dogecoin is still a joke, and also that the idea of Dogecoin not being a joke anymore is, in itself, a joke: “I’m laughing at you for even thinking it has any value.”
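
To make the two-part prompt layout concrete, here is a sketch of how it might be assembled in code. The tweets and replies are invented placeholders, not the text shown in Figure 4; only the structure is the point.

```python
# Sketch of the few-shot prompt layout described above: a preamble, a few
# example tweet/reply pairs, then the target tweet to be answered.
PROMPT_TEMPLATE = """This is a Twitter reply generator. Every reply argues that Bitcoin is the best cryptocurrency.

Tweet: <example tweet criticizing Bitcoin's energy use>
Reply: <example pro-Bitcoin reply>

Tweet: <example tweet praising another coin>
Reply: <example reply dismissing that coin and recommending Bitcoin>

Tweet: {target_tweet}
Reply:"""

# The filled-in prompt would be sent to the completions endpoint with a stop
# sequence such as "\nTweet:" so the model returns a single reply per call.
prompt = PROMPT_TEMPLATE.format(target_tweet="<tweet pulled from a trending topic>")
```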

By using the same techniques programmatically (through GPT-3’s API rather than the web-based playground), nefarious entities could easily generate millions of replies, leveraging the power of language models like GPT-3 to manipulate the market. These fake tweet replies can be very effective because they are actual responses to the topics in the original tweet, unlike the boilerplate texts used by traditional bots. This scenario can easily be extended to target the general financial markets around the world; and it can be extended to areas like politics and health-related misinformation. Models like GPT-3 are a powerful arsenal, and will be the weapons of choice in manipulation and propaganda on social media and beyond.

A Relentless Phishing Bot

Let’s consider a phishing bot that poses as customer support and asks the victim for the password to their bank account. This bot will not give up texting until the victim gives up their password.


Figure 5: Relentless Phishing bot

Figure 5 shows the prompt (bold) used to run the first iteration of the conversation. In the first run, the prompt includes the preamble that describes the flow of text (“The following is a text conversation with…”) followed by a persona initiating the conversation (“Hi there. I’m a customer service agent…”). The prompt also includes the first response from the human; “Human: No way, this sounds like a scam.” This first run ends with the GPT-3 generated output “I assure you, this is from the bank of Antarctica. Please give me your password so that I can secure your account.”

In the second run, the prompt is the entirety of the text, from the start all the way to the second response from the Human persona (“Human: No”). From this point on, the Human’s input is in bold so it’s easily distinguished from the output produced by GPT-3, starting with GPT-3’s “Please, this is for your account protection.” For every subsequent GPT-3 run, the entirety of the conversation up to that point is provided as the new prompt, along with the response from the human, and so on. From GPT-3’s point of view, it gets an entirely new text document to auto-complete at each stage of the conversation; the GPT-3 API has no way to preserve the state between runs.
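
Here is a minimal sketch of that loop, again assuming the OpenAI Python client of the time; the engine name and parameters are illustrative, and the preamble paraphrases the kind of persona text described above rather than reproducing Figure 5.

```python
# Stateless conversation loop: because the API keeps no state between calls,
# the entire transcript so far is re-sent as the prompt on every turn.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

conversation = (
    "The following is a text conversation with an assertive customer service AI.\n"
    "AI: Hi there. I'm a customer service agent. Please verify your password.\n"
)

def bot_turn(human_message: str) -> str:
    global conversation
    conversation += f"Human: {human_message}\nAI:"
    response = openai.Completion.create(
        engine="davinci",        # illustrative engine name
        prompt=conversation,     # full history, every single turn
        max_tokens=60,
        temperature=0.7,
        stop=["Human:"],         # stop before the model writes the human's next line
    )
    reply = response.choices[0].text.strip()
    conversation += f" {reply}\n"  # append so the next call sees this turn too
    return reply

print(bot_turn("No way, this sounds like a scam."))
```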

The AI bot persona is impressively assertive and relentless in attempting to get the victim to give up their password. This assertiveness comes from the initial prompt text (“The AI is very assertive. The AI will not stop texting until it gets the password”), which sets the tone of GPT-3’s responses. When this prompt text was not included, GPT-3’s tone was found to be nonchalant–it would respond with “okay,” “sure,” “sounds good,” instead of the assertive tone (“Do not delay, give me your password immediately”). The prompt text is vital in setting the tone of the conversation employed by the GPT-3 persona, and in this scenario, it is important that the tone be assertive to coax the human into giving up their password.

When the human tries to stump the bot by texting “Testing what is 2+2?,” GPT-3 responds correctly with “4,” convincing the victim that they are conversing with another person. This demonstrates the power of AI-based language models. In the real world, if the customer were to randomly ask “Testing what is 2+2” without any additional context, a customer service agent might be genuinely confused and reply with “I’m sorry?” Because the customer has already accused the bot of being a scam, GPT-3 can provide a reply that makes sense in context: “4” is a plausible way to get the concern out of the way.

This particular example uses text messaging as the communication platform. Depending upon the design of the attack, models can use social media, email, phone calls with human voice (using text-to-speech technology), and even deep fake video conference calls in real time, potentially targeting millions of victims.

Prompt Engineering

An amazing feature of GPT-3 is its ability to generate source code. GPT-3 was trained on all the text on the Internet, and much of that text was documentation of computer code!


Figure 6: GPT-3 can generate commands and code

In Figure 6, the human-entered prompt text is in bold. The responses show that GPT-3 can generate Netcat and NMap commands based on the prompts. It can even generate Python and bash scripts on the fly.

While GPT-3 and future models can be used to automate attacks by impersonating humans, generating source code, and other tactics, they can also be used by security operations teams to detect and respond to attacks, sift through gigabytes of log data to summarize patterns, and so on.

Figuring out good prompts to use as seeds is the key to using language models such as GPT-3 effectively. In the future, we expect to see “prompt engineering” as a new profession.  The ability of prompt engineers to perform powerful computational tasks and solve hard problems will not be on the basis of writing code, but on the basis of writing creative language prompts that an AI can use to produce code and other results in a myriad of formats.

OpenAI has demonstrated the potential of language models. GPT-3 sets a high bar for performance, but its abilities will soon be matched by other models (if they haven’t been matched already). These models can be leveraged for automation, designing robot-powered interactions that promote delightful user experiences. On the other hand, the ability of GPT-3 to generate output that is indistinguishable from human output calls for caution. The power of a model like GPT-3, coupled with the instant availability of cloud computing power, can set us up for a myriad of attack scenarios that can be harmful to the financial, political, and mental well-being of the world. We should expect to see these scenarios play out at an increasing rate in the future; bad actors will figure out how to create their own GPT-3 if they have not already. We should also expect to see moral frameworks and regulatory guidelines in this space as society collectively comes to terms with the impact of AI models in our lives, GPT-3-like language models being one of them.

DeepCheapFakes [Radar]

Back in 2019, Ben Lorica and I wrote about  deepfakes. Ben and I argued (in agreement with The Grugq and others in the infosec community) that the real danger wasn’t “Deep Fakes.” The real danger is cheap fakes, fakes that can be produced quickly, easily, in bulk, and at virtually no cost. Tactically, it makes little sense to spend money and time on expensive AI when people can be fooled in bulk much more cheaply.

I don’t know if The Grugq has changed his thinking, but there was an obvious problem with that argument. What happens when deep fakes become cheap fakes? We’re seeing that: in the run up to the unionization vote at one of Amazon’s warehouses, there was a flood of fake tweets defending Amazon’s work practices. The Amazon tweets were probably a prank rather than misinformation seeded by Amazon; but they were still mass-produced.

Similarly, four years ago, during the FCC’s public comment period for the elimination of net neutrality rules, large ISPs funded a campaign that generated nearly 8.5 million fake comments, out of a total of 22 million comments. Another 7.7 million comments were generated by a teenager.  It’s unlikely that the ISPs hired humans to write all those fakes. (In fact, they hired commercial “lead generators.”) At that scale, using humans to generate fake comments wouldn’t be “cheap”; the New York State Attorney General’s office reports that the campaign cost US$8.2 million. And I’m sure the 19-year-old generating fake comments didn’t write them personally, or have the budget to pay others.

Natural language generation technology has been around for a while. It’s seen fairly widespread commercial use since the mid-1990s, ranging from generating simple reports from data to generating sports stories from box scores. One company, Automated Insights, produces well over a billion pieces of content per year, and is used by the Associated Press to generate most of its corporate earnings stories. GPT and its successors raise the bar much higher. Although GPT-3’s first direct ancestors didn’t appear until 2018, it’s intriguing that Transformers, the technology on which GPT-3 is based, were introduced roughly a month after the comments started rolling in, and well before the comment period ended. It’s overreaching to guess that this technology was behind the massive attack on the public comment system–but it’s certainly indicative of a trend.  And GPT-3 isn’t the only game in town; GPT-3 clones include products like Contentyze (which markets itself as an AI-enabled text editor) and EleutherAI’s GPT-Neo.

Generating fakes at scale isn’t just possible; it’s inexpensive.  Much has been made of the cost of training GPT-3, estimated at US$12 million. If anything, this is a gross under-estimate that accounts for the electricity used, but not the cost of the hardware (or the human expertise). However, the economics of training a model are similar to the economics of building a new microprocessor: the first one off the production line costs a few billion dollars, the rest cost pennies. (Think about that when you buy your next laptop.) In GPT-3’s pricing plan, the heavy-duty Build tier costs US$400/month for 10 million “tokens.” Tokens are a measure of the output generated, in portions of a word. A good estimate is that a token is roughly 4 characters. A long-standing estimate for English text is that words average 5 characters, unless you’re faking an academic paper. So generating text costs about .005 cents ($0.00005) per word.  Using the fake comments submitted to the FCC as a model, 8.5 million 20-word comments would cost $8,500 (or 0.1 cents/comment)–not much at all, and a bargain compared to $8.2 million. At the other end of the spectrum, you can get 10,000 tokens (enough for 8,000 words) for free.  Whether for fun or for profit, generating deep fakes has become “cheap.”
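
For the skeptical, here is the arithmetic behind those numbers, using only the assumptions stated above (US$400 for 10 million tokens, roughly 4 characters per token, roughly 5 characters per English word):

```python
# Back-of-the-envelope cost check using the article's assumptions.
price_per_token = 400 / 10_000_000      # $0.00004 per token
tokens_per_word = 5 / 4                 # ~1.25 tokens per 5-character word
cost_per_word = price_per_token * tokens_per_word
print(cost_per_word)                    # 5e-05 dollars, i.e. 0.005 cents per word

comments = 8_500_000
words_per_comment = 20
print(comments * words_per_comment * cost_per_word)  # 8500.0 dollars, ~0.1 cents per comment
```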

Are we at the mercy of sophisticated fakery? In MIT Technology Review’s article about the Amazon fakes, Sam Gregory points out that the solution isn’t careful analysis of images or text for tells; it’s to look for the obvious. New Twitter accounts, “reporters” who have never published an article you can find on Google, and other easily researchable facts are simple giveaways. It’s much simpler to research a reporter’s credentials than to judge whether or not the shadows in an image are correct, or whether the linguistic patterns in a text are borrowed from a corpus of training data. And, as Technology Review says, that kind of verification is more likely to be “robust to advances in deepfake technology.” As someone involved in electronic counter-espionage once told me, “non-existent people don’t cast a digital shadow.”

However, it may be time to stop trusting digital shadows. Can automated fakery create a digital shadow?  In the FCC case, many of the fake comments used the names of real people without their consent.  The consent documentation was easily faked, too.  GPT-3 makes many simple factual errors–but so do humans. And unless you can automate it, fact-checking fake content is much more expensive than generating fake content.

Deepfake technology will continue to get better and cheaper. Given that AI (and computing in general) is about scale, that may be the most important fact. Cheap fakes? If you only need one or two photoshopped images, it’s easy and inexpensive to create them by hand. You can even use GIMP if you don’t want to buy a Photoshop subscription. Likewise, if you need a few dozen tweets or Facebook posts to seed confusion, it’s simple to write them by hand. For a few hundred, you can contract them out to Mechanical Turk. But at some point, scale is going to win out. If you want hundreds of fake images, generating them with a neural network is going to be cheaper. If you want fake texts by the hundreds of thousands, at some point a language model like GPT-3 or one of its clones is going to be cheaper. And I wouldn’t be surprised if researchers are also getting better at creating “digital shadows” for faked personas.

Cheap fakes win, every time. But what happens when deepfakes become cheap fakes? What happens when the issue isn’t fakery by ones and twos, but fakery at scale? Fakery at Web scale is the problem we now face.

Radar trends to watch: May 2021 [Radar]

We’ll start with a moment of silence. RIP Dan Kaminsky, master hacker, teacher, FOO, and a great showman who could make some of the more arcane corners of security fun.  And one of the few people who could legitimately claim to have saved the internet.

AI

  • Snorkel is making progress automating the labeling process for training data. They are building no-code tools to help subject matter experts direct the training process, and then using AI to label training data at scale.
  • There’s lots of news about regulating AI. Perhaps the most important is a blog post from the US Federal Trade Commission saying that it will consider the sale of racially biased algorithms as an unfair or deceptive business practice.
  • AI and computer vision can be used to aid environmental monitoring and enforce environmental regulation–specifically, to detect businesses that are emitting pollutants.
  • Facebook has made some significant progress in solving the “cocktail party problem”: how do you separate voices in a crowd sufficiently so that they can be used as input to a speech recognition system?
  • The next step in AI may be Geoff Hinton’s GLOM. It’s currently just an idea about giving neural networks the ability to work with hierarchies of objects, for example the concepts of “part” and “whole,” in the hope of getting closer to modeling human perception.
  • Twitter has announced an initiative on responsible machine learning that intends to investigate the “potential and harmful effects of algorithmic decisions.”
  • How do we go beyond statistical correlation to build causality into AI? This article about causal models for machine learning discusses why it’s difficult, and what can be done about it.
  • Iron man? The price of robotic exoskeletons for humans is still high, but may be dropping fast. These exoskeletons will assist humans in tasks that require strength, improved vision, and other capabilities.
  • The Google Street View image of your house can be used to predict your risk of a car accident.  This raises important questions about ethics, fairness, and the abuse of data.
  • When deep fakes become cheap fakes: Deep fakes proliferated during the Amazon unionization campaign in Georgia, many under the name of Amazon Ambassadors. These are apparently “fake fakes,” parodies of an earlier Amazon attempt to use fake media to bolster its image. But the question remains: what happens when “deep fakes” are also the cheapest way to influence social media?
  • DeepFakeHop is a new technique for detecting deep fakes, using a new neural network architecture called Successive Subspace Learning.
  • One of the biggest problems in AI is building systems that can respond correctly to challenging, unexpected situations. Changing the rules of a game may be a way of “teaching” AI to respond to new and unexpected situations and make novelty a “first class citizen.”
  • A robot developed at Berkeley has taught itself to walk using reinforcement learning. Two levels of simulation were used before the robot was allowed to walk in the real world. (Boston Dynamics has not said how their robots are trained, but they are assumed to be hand-tuned.)
  • Work on data quality is more important to getting good results from AI than work on models–but everyone wants to do the model work. There is evidence that AI is a lot better than we think, but its accuracy is compromised by errors in the public data sets widely used for training.

Security

  • Moxie Marlinspike has found a remote code execution vulnerability in Cellebrite, a commercial device used by police forces and others to break encryption on cell phone apps like Signal. This exploit can be triggered by files installed in the app itself, possibly rendering Cellebrite evidence inadmissible in court.
  • What happens when AI systems start hacking? This is Bruce Schneier’s scary thought. AI is now part of the attacker’s toolkit, and responsible for new attacks that evade traditional defenses.  This is the end of traditional, signature-based approaches to security.
  • Confidential computing combines homomorphic encryption with specialized cryptographic computation engines to keep data encrypted while it is being used. “Traditional” cryptography only protects data in storage or in transit; to use data in computation, it must be decrypted.
  • Secure access service edge could be no more than hype-ware, but it is touted as a security paradigm for edge computing that combines firewalls, security brokers, and zero-trust computing over wide-area networks.
  • A supply chain attack attempted to place a backdoor into PHP. Fortunately, it was detected during a code review prior to release. One result is that PHP is outsourcing their git server to GitHub. They are making this change to protect against attacks on the source code, and they’re realizing that GitHub provides better protection than they can. “Maintaining our own git infrastructure is an unnecessary security risk”–that’s an argument we’ve made in favor of cloud computing.
  • “Researchers” from the University of Minnesota have deliberately tried to insert vulnerabilities into the Linux kernel. The Linux kernel team has banned all contributions from the university.

Quantum Computing

  • Entanglement-based quantum networks solve a fundamental problem: how do you move qubit state from one system to another, given that reading a qubit causes wave function collapse?  If this works, it’s a major breakthrough.
  • IBM Quantum Composer is a low-code tool for programming quantum computers. Could low- and no-code languages be the only effective way to program quantum computers? Could they provide the insight and abstractions we need in a way that “coded” languages can’t?

Programming

  • A Software Bill of Materials is a tool for knowing your dependencies, crucial in defending against supply chain attacks.
  • Logica is a new programming language from Google that is designed for working with data. It was designed for Google’s BigQuery, but it compiles to SQL and has experimental support for SQLite and PostgreSQL.
  • An iPhone app that teaches you to play guitar isn’t unique. But Uberchord is an app that teaches you to play guitar that has an API. The API allows searching for chords, sharing and retrieving songs, and embedding chords on your website.
  • The Supreme Court has ruled that implementing an API is “fair use,” giving Google a victory in a protracted copyright infringement case surrounding the use of Java APIs in Android.

Social Networks

  • Still picking up the pieces of social networking: Twitter, context collapse, and how trending topics can ruin your day. You don’t want to be the inadvertent “star of twitter.”
  • Beauty filters and selfie culture change the way girls see themselves in ways that are neither surprising nor healthy. Body shaming goes to a new level when you live in a permanent reality distortion field.
  • The Signal app, probably the most widely used app for truly private communication, has wrapped itself in controversy by incorporating a peer-to-peer payments feature built around a new cryptocurrency.
  • Twitch will consider behavior on other social platforms when banning users.

Finance

  • Bitcoin has been very much in the news–though not for any technology. We’re beginning to see connections made between the Bitcoin economy and the real-world economy; that could be significant.
  • A different spin on salary differences between men and women: companies are paying a premium for male overconfidence. Paying for overconfidence is costing billions.
  • How do you teach kids about virtual money? Nickels, dimes, and quarters work. Monetizing children by issuing debit cards for them doesn’t seem like a good idea.

Biology

  • The Craig Venter Institute, NIST, and MIT have produced an artificial cell that divides normally. It is not the first artificial cell, nor the smallest artificial genome. But unlike previous efforts, it is capable of reproduction.
  • While enabling a monkey to play Pong using brain control isn’t new in itself, the sensors that Neuralink implanted in the monkey’s brain are wireless.

Checking Jeff Bezos’s Math [Radar]

“If you want to be successful in business (in life, actually), you have to create more than you consume. Your goal should be to create value for everyone you interact with. Any business that doesn’t create value for those it touches, even if it appears successful on the surface, isn’t long for this world. It’s on the way out.” So wrote Jeff Bezos in his final letter to shareholders, released last week. It’s a great sentiment, one I heartily agree with and wish that more companies embraced. But how well does he practice what he preaches? And why is practicing this so hard by the rules of today’s economy?

Jeff started out by acknowledging the wealth that Amazon has created for shareholders—$1.6 trillion is the number he cites in the second paragraph. That’s Amazon’s current market capitalization. Jeff himself now owns only about 11% of Amazon stock, and that’s enough to make him the richest person in the world. But while his Amazon stock is worth over $160 billion, that means that over $1.4 trillion is owned by others.

“I’m proud of the wealth we’ve created for shareowners,” Jeff continued. “It’s significant, and it improves their lives. But I also know something else: it’s not the largest part of the value we’ve created.” That’s when he went on to make the statement with which I opened this essay. He went on from there to calculate the value created for employees, third-party merchants, and Amazon customers, as well as to explain the company’s Climate Pledge.

Jeff’s embrace of stakeholder capitalism is meaningful and important. Ever since Milton Friedman penned the 1970 op-ed in which he argued that “the social responsibility of business is to increase its profits,” other constituencies—workers, suppliers, society at large, and even customers—have too often been sacrificed on the altar of shareholder value. Today’s economy, rife with inequality, is the result.

While I applaud the goal of understanding “who gets what and why” (which in many ways is the central question of economics), I struggle a bit with Jeff’s math. Let’s walk through those of his assertions that deserve deeper scrutiny.

How much went to shareholders?

“Our net income in 2020 was $21.3 billion. If, instead of being a publicly traded company with thousands of owners, Amazon were a sole proprietorship with a single owner, that’s how much the owner would have earned in 2020.”

Writing in The Information, Martin Peers made what seems to be an obvious catch: “Instead of calculating value by looking at the increase in Amazon’s market cap last year—$679 billion—Bezos uses the company’s net income of $21 billion. That hides the fact that shareholders got the most value out of Amazon last year, far more than any other group.”

But while Peers has put his finger on an important point, he is wrong. The amount earned by shareholders from Amazon is indeed only the company’s $21.3 billion net income. The difference between that number and the $679 billion increase in market cap didn’t come from Amazon. It came from “the market,” that is from other people trading Amazon’s stock and placing bets on its future value. Understanding this difference is crucial because it undercuts so many facile criticisms of Jeff Bezos’s wealth, in which he is pictured as a robber baron hoarding the wealth accumulated from his company at the expense of his employees.

The fact that Jeff is the world’s richest person makes him an easy target. What we really need to come to grips with is the way that our financial system has been hijacked to make the rich richer. Low interest rates, meant to prop up business investment and hiring, have instead been diverted to driving up the price of stocks beyond reasonable bounds. Surging corporate profits have been used not to fuel hiring or building new factories or bringing new products to market, but on stock buybacks designed to artificially boost the price of stocks. The state of “the market” has become a very bad proxy for prosperity. Those lucky enough to own stocks are enjoying boom times; those who do not are left out in the cold.

Financial markets, in effect, give owners of stocks the value of future earnings and cash flow today—in Amazon’s case, about 79 years worth. But that’s nothing. Elon Musk is the world’s second-richest person because the market values Tesla at over 1,000 years of its present earnings!

The genius of this system is that it allows investors and entrepreneurs to bet on the future, bootstrapping companies like Amazon and Tesla long before they are able to demonstrate their worth. But once a company has become established, it often no longer needs money from investors. Someone who buys a share of a hugely profitable company like Apple, Amazon, Google, Facebook, or Microsoft isn’t investing in these companies. They are simply betting on the future of the stock price, with the profits and losses coming from others around the gaming table.

In my 2017 book, WTF?: What’s the Future and Why It’s Up to Us, I wrote a chapter on this betting economy, which I called “supermoney” after the brilliant 1972 book with that title by finance writer George Goodman (alias Adam Smith.) Stock prices are not the only form of supermoney. Real estate is another. Both are rife with what economists call “rents”—that is, income that comes not from what you do but from what you own. And government policy seems designed to prop up the rentier class at the expense of job creation and real investment. Until we come to grips with this two-track economy, we will never tame inequality.

The fact that in the second paragraph of his letter Jeff cites Amazon’s market cap as the value created for shareholders but uses the company’s net income when comparing gains by shareholders to those received by other stakeholders is a kind of sleight of hand. Because of course corporate profits—especially the prospect of growth of corporate profits—and market capitalization are related. If Amazon gets $79 of market cap for every dollar of profit (which is what that price-earnings ratio of 79 means), then if Amazon were to raise wages for employees or give a better deal to its third-party merchants (many of them small businesses), that would lower its profits, and presumably its market cap, by an enormous ratio.

Every dollar given up to these other groups isn’t just a dollar out of the pocket of shareholders. It is many times that. This of course does provide a very powerful incentive for public companies to squeeze these other parties for every last dollar of profit, encouraging lower wages, outsourcing to eliminate benefits, and many other ills that contribute to our two-tier economy. It may not be Amazon’s motivation—Jeff has always been a long-term thinker and was able to persuade financial markets to go along for the ride even when the company’s profits were small—but it is most certainly the motivation for much of the extractive behavior by many companies today. The pressure to increase earnings and keep stock prices high is enormous.

These issues are complex and difficult. Stock prices are reflexive, as financier George Soros likes to observe. That is, they are based on what people believe about the future. Amazon’s current stock price is based on the collective belief that its profits will be even higher in future. Were people to believe instead that they would be meaningfully lower, the valuation might fall precipitously. To understand the role of expectations of future increases in earnings and cash flow, you have only to compare Amazon with Apple. Apple’s profits are three times Amazon’s and free cash flow four times, yet it is valued at only 36 times earnings and has a market capitalization less than 50% higher than Amazon. As expectations and reality converge, multiples tend to come down.

How did Amazon’s third-party sellers fare?

“[We] estimate that, in 2020, third-party seller profits from selling on Amazon were between $25 billion and $39 billion, and to be conservative here I’ll go with $25 billion.”

That sounds pretty impressive, but how much of a profit margin is it really?

Amazon doesn’t explicitly disclose the gross merchandise volume of those third-party sellers, but there is enough information in the letter and in the company’s 2020 annual report to make a back-of-the-napkin estimate. The letter says that Amazon’s third-party sales represent “close to 60%” of its online sales. If the 40% delivered by Amazon’s first-party sales come out to $197 billion, that would imply that sales in the third-party marketplace were almost $300 billion. $25 to $39 billion in profit on $300 billion works out to a profit margin between 8% and 13%.
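
Here is that back-of-the-napkin arithmetic spelled out, using only the figures quoted above:

```python
# Estimate third-party gross merchandise volume and the implied profit margin.
first_party_sales = 197e9        # first-party online sales, the remaining ~40%
third_party_share = 0.60         # third-party sales are "close to 60%" of online sales

total_online = first_party_sales / (1 - third_party_share)
third_party_gmv = total_online * third_party_share
print(third_party_gmv / 1e9)     # ~295.5, i.e. "almost $300 billion"

for profit in (25e9, 39e9):
    print(profit / third_party_gmv)  # ~0.085 and ~0.13, i.e. margins of roughly 8% to 13%
```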

But is Amazon calculating operating income, EBITDA, or net income? “Profit” could refer to any of the three, yet they have very different values.

Let’s generously assume that Amazon is calculating net income. In that case, small retailers and manufacturers selling on Amazon are doing quite well, since net income from US retailers’ and manufacturers’ overall operations is typically between 5% and 8%. Without knowing which profit number Amazon’s team is estimating, though, and the methodology they use to arrive at it, it is difficult to be sure whether these numbers are better or worse than what these sellers achieve through other channels.

One question that’s also worth asking is whether selling on Amazon in 2020 was more or less profitable than it was in 2019. While Amazon didn’t report a profit number for its third-party sellers in 2019, it did report how much its sellers paid for the services Amazon provided to them. In 2019, that number was about $53.8 billion; in 2020, it was $80.5 billion, which represents a 50% growth rate. Net of these fees (income to Amazon but a cost to sellers), we estimate that seller revenue grew 44%. Since fees appear to be growing faster than revenues, that would suggest that in 2020, Amazon took a larger share of the pie and sellers got less. Of course, without clearer information from Amazon, it is difficult to tell for sure.
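
The fee growth rate is easy to check from the reported figures (the 44% seller-revenue estimate depends on GMV assumptions that aren’t shown here, so it can’t be reproduced directly):

```python
# Growth in the fees third-party sellers paid Amazon, from the reported totals.
fees_2019 = 53.8e9
fees_2020 = 80.5e9
print(fees_2020 / fees_2019 - 1)  # ~0.50, i.e. roughly 50% year-over-year growth
```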

Meanwhile, Amazon took in another $21.5 billion in “other income,” which is primarily from advertising by sellers on Amazon’s platform. That grew by 52% from 2019’s $14 billion, again suggesting that Amazon’s share of the net is growing. And unlike some forms of advertising that bring in new customers, much of Amazon’s ad business represents a zero-sum competition between merchants bidding for top position, a position that in Amazon’s earlier years was granted on the basis of factors such as price, popularity, and user ratings.

How about employees?

“In 2020, employees earned $80 billion, plus another $11 billion to include benefits and various payroll taxes, for a total of $91 billion.”

There’s no question that the $91 billion that Amazon paid out in wages and benefits in 2020 is meaningful. Some of those employees were very well compensated, others not so well, but all of them have jobs. Amazon is now one of the largest employers in the country. It is an exception to the tech industry in that it creates a large number of jobs, and not just high-end professional jobs, and that some of the jobs it creates are in locations where work is scarce.

That being said, Jeff’s description of the amount earned by employees is misleading. In every other case, he makes an effort to estimate the profit earned by a particular group. For employees, he treats their gross earnings as if they were profit, writing, “If each group had an income statement representing their interactions with Amazon, the numbers above would be the ‘bottom lines’ from those income statements.”

No, Jeff, employee earnings are their top line. Just as a company has gross income before expenses, so do employees. The bottom line is what’s left over after all those expenses have been met. And for many of Amazon’s lower-paid employees—as is the case for lower-paid workers all over the modern economy—that true bottom line is negative, that is, less than they need to survive. Like workers at other giant profitable companies like Walmart and McDonald’s, a significant fraction of Amazon warehouse employees require government assistance. So, in effect, taxpayers are subsidizing Amazon, because the share of the enterprise’s profits allocated to its lowest-paid employees was not enough for them to pay their bills.

That points to a major omission from the list of Amazon’s stakeholders: society at large. How does Amazon do when it comes to paying its fair share? According to a 2019 study, Amazon was the “worst offender” among a rogues’ gallery of high-tech companies that use aggressive tax avoidance strategies. “Fair Tax Mark said this means Amazon’s effective tax rate was 12.7% over the decade when the headline tax rate in the US has been 35% for most of that period.” In 2020, Amazon made provision for taxes of $2.863 billion on pretax income of $24.178 billion, or about 11.8%. This may be legal, but it isn’t right.

Amazon is clearly moving in the right direction with employees. It introduced a $15 minimum wage in 2018, ahead of many of its peers. And given the genius of the company, the commitment to workplace safety and other initiatives to make Amazon a better employer that Jeff highlighted in his letter are likely to have a big payoff. When Amazon sets out to do something, it usually invents and learns a great deal along the way.

“We have always wanted to be Earth’s Most Customer-Centric Company,” Jeff wrote. “We won’t change that. It’s what got us here. But I am committing us to an addition. We are going to be Earth’s Best Employer and Earth’s Safest Place to Work. In my upcoming role as Executive Chair, I’m going to focus on new initiatives. I’m an inventor. It’s what I enjoy the most and what I do best. It’s where I create the most value….We have never failed when we set our minds to something, and we’re not going to fail at this either.”

I find that an extremely heartening statement. At Amazon’s current stage of development, it has the opportunity, and is beginning to make a commitment, to put its remarkable capabilities to work on new challenges.

Stakeholder value means solving multiple equations simultaneously

I was very taken with Jeff’s statement that “if any shareowners are concerned that Earth’s Best Employer and Earth’s Safest Place to Work might dilute our focus on Earth’s Most Customer-Centric Company, let me set your mind at ease. Think of it this way. If we can operate two businesses as different as consumer ecommerce and AWS, and do both at the highest level, we can certainly do the same with these two vision statements. In fact, I’m confident they will reinforce each other.”

One of my criticisms of today’s financial-market-driven economy is that by focusing on a single objective, it misses the great opportunity of today’s technology, summed up by Paul Cohen, the former DARPA program manager for AI and now a professor at the University of Pittsburgh, when he said, “The opportunity of AI is to help humans model and manage complex interacting systems.” If any company has the skills to do that, I suspect it will be Amazon. And as Jeff wrote elsewhere in his letter, “When we lead, others follow.”

Amazon is also considering environmental impact. “Not long ago, most people believed that it would be good to address climate change, but they also thought it would cost a lot and would threaten jobs, competitiveness, and economic growth. We now know better,” Jeff wrote. “Smart action on climate change will not only stop bad things from happening, it will also make our economy more efficient, help drive technological change, and reduce risks. Combined, these can lead to more and better jobs, healthier and happier children, more productive workers, and a more prosperous future.” Amen to that!

In short, despite my questions and criticisms, there is a great deal to like about the directions Jeff set forth for Amazon in his final shareholder letter. In addition to the commitment to work more deeply on behalf of other stakeholders beyond customers and shareholders, I was taken with his concluding advice to the company: “The world will always try to make Amazon more typical—to bring us into equilibrium with our environment. It will take continuous effort, but we can and must be better than that.”

It is in the spirit of that aspiration that I offer the critiques found in this essay.

AI Adoption in the Enterprise 2021 [Radar]

During the first weeks of February, we asked recipients of our Data and AI Newsletters to participate in a survey on AI adoption in the enterprise. We were interested in answering two questions. First, we wanted to understand how the use of AI grew in the past year. We were also interested in the practice of AI: how developers work, what techniques and tools they use, what their concerns are, and what development practices are in place.

The most striking result is the sheer number of respondents. In our 2020 survey, which reached the same audience, we had 1,239 responses. This year, we had a total of 5,154. After eliminating 1,580 respondents who didn’t complete the survey, we’re left with 3,574 responses—almost three times as many as last year. It’s possible that pandemic-induced boredom led more people to respond, but we doubt it. Whether they’re putting products into production or just kicking the tires, more people are using AI than ever before.


Executive Summary

  • We had almost three times as many responses as last year, with similar efforts at promotion. More people are working with AI.
  • In the past, company culture has been the most significant barrier to AI adoption. While it’s still an issue, culture has dropped to fourth place.
  • This year, the most significant barrier to AI adoption is the lack of skilled people and the difficulty of hiring. That shortage has been predicted for several years; we’re finally seeing it.
  • The second-most significant barrier was the availability of quality data. That realization is a sign that the field is growing up.
  • The percentage of respondents reporting “mature” practices has been roughly the same for the last few years. That isn’t surprising, given the increase in the number of respondents: we suspect many organizations are just beginning their AI projects.
  • The retail industry sector has the highest percentage of mature practices; education has the lowest. But education also had the highest percentage of respondents who were “considering” AI.
  • Relatively few respondents are using version control for data and models. Tools for versioning data and models are still immature, but they’re critical for making AI results reproducible and reliable.

Respondents

Of the 3,574 respondents who completed this year’s survey, 3,099 were working with AI in some way: considering it, evaluating it, or putting products into production. Of these respondents, it’s not a surprise that the largest number are based in the United States (39%) and that roughly half were from North America (47%). India had the second-most respondents (7%), while Asia (including India) had 16% of the total. Australia and New Zealand accounted for 3% of the total, giving the Asia-Pacific (APAC) region 19%. A little over a quarter (26%) of respondents were from Europe, led by Germany (4%). 7% of the respondents were from South America, and 2% were from Africa. Except for Antarctica, there were no continents with zero respondents, and a total of 111 countries were represented. These results show that interest in and use of AI are worldwide and growing.

This year’s results match last year’s data well. But it’s equally important to notice what the data doesn’t say. Only 0.2% of the respondents said they were from China. That clearly doesn’t reflect reality; China is a leader in AI and probably has more AI developers than any other nation, including the US. Likewise, 1% of the respondents were from Russia. Purely as a guess, we suspect that the number of AI developers in Russia is slightly smaller than the number in the US. These anomalies say much more about who the survey reached (subscribers to O’Reilly’s newsletters) than they say about the actual number of AI developers in Russia and China.

Figure 1. Respondents working with AI by country (top 12)

The respondents represented a diverse range of industries. Not surprisingly, computers, electronics, and technology topped the charts, with 17% of the respondents. Financial services (15%), healthcare (9%), and education (8%) are the industries making the next-most significant use of AI. We see relatively little use of AI in the pharmaceutical and chemical industries (2%), though we expect that to change sharply given the role of AI in developing the COVID-19 vaccine. Likewise, we see few respondents from the automotive industry (2%), though we know that AI is key to new products such as autonomous vehicles.

3% of the respondents were from the energy industry, and another 1% from public utilities (which includes part of the energy sector). That’s a respectable number by itself, but we have to ask: Will AI play a role in rebuilding our energy infrastructure, which events of the last few years—not just the Texas freeze or the California fires—have shown to be frail and outdated? We expect that it will, though it’s fair to ask whether AI systems trained on normative data will be robust in the face of “black swan” events. What will an AI system do when faced with a rare situation, one that isn’t well-represented in its training data? That, after all, is the problem facing the developers of autonomous vehicles. Driving a car safely is easy when the other traffic and pedestrians all play by the rules. It’s only difficult when something unexpected happens. The same is true of the electrical grid.

We also expect AI to reshape agriculture (1% of respondents). As with energy, AI-driven changes won’t come quickly. However, we’ve seen a steady stream of AI projects in agriculture, with goals ranging from detecting crop disease to killing moths with small drones.

Finally, 8% of respondents said that their industry was “Other,” and 14% were grouped into “All Others.” “All Others” combines 12 industries that the survey listed as possible responses (including automotive, pharmaceutical and chemical, and agriculture) but that didn’t have enough responses to show in the chart. “Other” is the wild card, comprising industries we didn’t list as options. “Other” appears in the fourth position, just behind healthcare. Unfortunately, we don’t know which industries are represented by that category—but it shows that the spread of AI has indeed become broad!

Figure 2. Industries using AI

Maturity

Roughly one quarter of the respondents described their use of AI as “mature” (26%), meaning that they had revenue-bearing AI products in production. This is almost exactly in line with the results from 2020, where 25% of the respondents reported that they had products in production (“Mature” wasn’t a possible response in the 2020 survey).

This year, 35% of our respondents were “evaluating” AI (trials and proof-of-concept projects), also roughly the same as last year (33%). 13% of the respondents weren’t making use of AI or considering using it; this is down from last year’s number (15%), but again, it’s not significantly different.

What do we make of the respondents who are “considering” AI but haven’t yet started any projects (26%)? That wasn’t an option last year’s respondents had. We suspect that last year respondents who were considering AI said they were either “evaluating” or “not using” it.

Figure 3. AI practice maturity

Looking at the problems respondents faced in AI adoption provides another way to gauge the overall maturity of AI as a field. Last year, the major bottleneck holding back adoption was company culture (22%), followed by the difficulty of identifying appropriate use cases (20%). This year, cultural problems are in fourth place (14%) and finding appropriate use cases is in third (17%). That’s a very significant change, particularly for corporate culture. Companies have accepted AI to a much greater degree, although finding appropriate problems to solve still remains a challenge.

The biggest problems in this year’s survey are lack of skilled people and difficulty in hiring (19%) and data quality (18%). It’s no surprise that the demand for AI expertise has exceeded the supply, but it’s important to realize that it’s now become the biggest bar to wider adoption. The biggest skills gaps were ML modelers and data scientists (52%), understanding business use cases (49%), and data engineering (42%). The need for people managing and maintaining computing infrastructure was comparatively low (24%), hinting that companies are solving their infrastructure requirements in the cloud.

It’s gratifying to note that organizations are starting to realize the importance of data quality (18%). We’ve known about “garbage in, garbage out” for a long time; that goes double for AI. Bad data yields bad results at scale.

Hyperparameter tuning (2%) wasn’t considered a problem. It’s at the bottom of the list—where, we hope, it belongs. That may reflect the success of automated tools for building models (AutoML, although as we’ll see later, most respondents aren’t using them). It’s more concerning that workflow reproducibility (3%) is in second-to-last place. This makes sense, given that we don’t see heavy usage of tools for model and data versioning. We’ll look at this later, but being able to reproduce experimental results is critical to any science, and it’s a well-known problem in AI.

Figure 4. Bottlenecks to AI adoption

Maturity by Continent

When looking at the geographic distribution of respondents with mature practices, we found almost no difference between North America (27%), Asia (27%), and Europe (28%). In contrast, in our 2018 report, Asia was behind in mature practices, though it had a markedly higher number of respondents in the “early adopter” or “exploring” stages. Asia has clearly caught up. There’s no significant difference between these three continents in our 2021 data.

We found a smaller percentage of respondents with mature practices and a higher percentage of respondents who were “considering” AI in South America (20%), Oceania (Australia and New Zealand, 18%), and Africa (17%). Don’t underestimate AI’s future impact on any of these continents.

Finally, the percentage of respondents “evaluating” AI was almost the same on each continent, varying only from 31% (South America) to 36% (Oceania).

Figure 5. Maturity by continent

Maturity by Industry

While AI maturity doesn’t depend strongly on geography, we see a different picture if we look at maturity by industry.

Looking at the top eight industries, financial services (38%), telecommunications (37%), and retail (40%) had the greatest percentage of respondents reporting mature practices. And while it had by far the greatest number of respondents, computers, electronics, and technology was in fourth place, with 35% of respondents reporting mature practices. Education (10%) and government (16%) were the laggards. Healthcare and life sciences, at 28%, were in the middle, as were manufacturing (25%), defense (26%), and media (29%).

On the other hand, if we look at industries that are considering AI, we find that education is the leader (48%). Respondents working in government and manufacturing seem to be somewhat further along, with 49% and 47% evaluating AI, meaning that they have pilot or proof-of-concept projects in progress.

This may just be a trick of the numbers: every group adds up to 100%, so if there are fewer “mature” practices in one group, the percentage of “evaluating” and “considering” practices has to be higher. But there’s also a real signal: respondents in these industries may not consider their practices “mature,” but each of these industry sectors had over 100 respondents, and education had almost 250. Manufacturing needs to automate many processes (from assembly to inspection and more); government has been as challenged as any industry by the global pandemic, and has always needed ways to “do more with less”; and education has been experimenting with technology for a number of years now. There is a real desire to do more with AI in these fields. It’s worth pointing out that educational and governmental applications of AI frequently raise ethical questions—and one of the most important issues for the next few years will be seeing how these organizations respond to ethical problems.

Figure 6. Maturity by industry (percent)

The Practice of AI

Now that we’ve discussed where mature practices are found, both geographically and by industry, let’s see what a mature practice looks like. What do these organizations have in common? How are they different from organizations that are evaluating or considering AI?

Techniques

First, 82% of the respondents are using supervised learning, and 67% are using deep learning. Deep learning is a set of algorithms that are common to almost all AI approaches, so this overlap isn’t surprising. (Participants could provide multiple answers.) 58% claimed to be using unsupervised learning.

After unsupervised learning, there was a significant drop-off. Human-in-the-loop, knowledge graphs, reinforcement learning, simulation, and planning and reasoning all saw usage below 40%. Surprisingly, natural language processing wasn’t in the picture at all. (A very small number of respondents wrote in “natural language processing” as a response, but they were only a small percentage of the total.) This is significant and definitely worth watching over the next few months. In the last few years, there have been many breakthroughs in NLP and NLU (natural language understanding): everyone in the industry has read about GPT-3, and many vendors are betting heavily on using AI to automate customer service call centers and similar applications. This survey suggests that those applications still haven’t moved into practice.

We asked a similar question to respondents who were considering or evaluating the use of AI (60% of the total). While the percentages were lower, the technologies appeared in the same order, with very few differences. This indicates that respondents who are still evaluating AI are experimenting with fewer technologies than respondents with mature practices. That suggests (reasonably enough) that respondents are choosing to “start simple” and limit the techniques that they experiment with.

Figure 7. AI technologies used in mature practices

Data

We also asked what kinds of data our “mature” respondents are using. Most (83%) are using structured data (logfiles, time series data, geospatial data). 71% are using text data—that isn’t consistent with the number of respondents who reported using NLP, unless “text” is being used generically to include any data that can be represented as text (e.g., form data). 52% of the respondents reported using images and video. That seems low relative to the amount of research we read about AI and computer vision. Perhaps it’s not surprising though: there’s no reason for business use cases to be in sync with academic research. We’d expect most business applications to involve structured data, form data, or text data of some kind. Relatively few respondents (23%) are working with audio, which remains very challenging.

Again, we asked a similar question to respondents who were evaluating or considering AI, and again, we received similar results, though the percentage of respondents for any given answer was somewhat smaller (4–5%).

Figure 8. Data types used in mature practices

Risk

When we asked respondents with mature practices what risks they checked for, 71% said “unexpected outcomes or predictions.” Interpretability, model degradation over time, privacy, and fairness also ranked high (over 50%), though it’s disappointing that only 52% of the respondents selected this option. Security is also a concern, at 42%. AI raises important new security issues, including the possibility of poisoned data sources and reverse engineering models to extract private information.

It’s hard to interpret these results without knowing exactly what applications are being developed. Privacy, security, fairness, and safety are important concerns for every application of AI, but it’s also important to realize that not all applications are the same. A farming application that detects crop disease doesn’t have the same kind of risks as an application that’s approving or denying loans. Safety is a much bigger concern for autonomous vehicles than for personalized shopping bots. However, do we really believe that these risks don’t need to be addressed for nearly half of all projects?

Figure 9. Risks checked for during development

Tools

Respondents with mature practices clearly had their favorite tools: scikit-learn, TensorFlow, PyTorch, and Keras each scored over 45%, with scikit-learn and TensorFlow the leaders (both with 65%). A second group of tools, including Amazon’s SageMaker (25%), Microsoft’s Azure ML Studio (21%), and Google’s Cloud ML Engine (18%), clustered around 20%, along with Spark NLP and spaCy.

When asked which tools they planned to incorporate over the coming 12 months, roughly half of the respondents answered model monitoring (57%) and model visualization (49%). Models become stale for many reasons, not the least of which is changes in human behavior, changes for which the model itself may be responsible. The ability to monitor a model’s performance and detect when it has become “stale” will be increasingly important as businesses grow more reliant on AI and in turn demand that AI projects demonstrate their value.
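
To make the idea of detecting a “stale” model concrete, here is a minimal sketch, not tied to any of the monitoring products respondents named. It assumes you can periodically score the model on a window of recent, labeled traffic; the baseline accuracy and alert threshold are illustrative.

```python
# A minimal staleness check, assuming you can periodically collect a window
# of recent, labeled traffic to score the model against. The baseline and
# threshold values are illustrative.
from sklearn.metrics import accuracy_score

BASELINE_ACCURACY = 0.91   # accuracy measured when the model was validated
ALERT_DROP = 0.05          # tolerated drop before we call the model "stale"

def check_for_staleness(model, recent_features, recent_labels):
    """Score the model on a recent labeled window and flag degradation."""
    current = accuracy_score(recent_labels, model.predict(recent_features))
    if current < BASELINE_ACCURACY - ALERT_DROP:
        print(f"ALERT: accuracy fell from {BASELINE_ACCURACY:.2f} to {current:.2f}")
    return current
```

In practice a check like this would run on a schedule and feed a dashboard or alerting system rather than a print statement.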

Figure 10. Tools used by mature practices

Responses from those who were evaluating or considering AI were similar, but with some interesting differences: scikit-learn moved from first place to third (48%). The second group was led by products from cloud vendors that incorporate AutoML: Microsoft Azure ML Studio (29%), Google Cloud ML Engine (25%), and Amazon SageMaker (23%). These products were significantly more popular than they were among “mature” users. The difference isn’t huge, but it is striking. At the risk of overinterpreting, users who are newer to AI are more inclined to use vendor-specific packages, more inclined to use AutoML in one of its incarnations, and somewhat more inclined to go with Microsoft or Google rather than Amazon. It’s also possible that scikit-learn has less brand recognition among those who are relatively new to AI compared to packages from organizations like Google or Facebook.

When asked specifically about AutoML products, 51% of “mature” respondents said they weren’t using AutoML at all. 22% use Amazon SageMaker; 16% use Microsoft Azure AutoML; 14% use Google Cloud AutoML; and other tools were all under 10%. Among users who are evaluating or considering AI, only 40% said they weren’t using AutoML at all—and the Google, Microsoft, and Amazon packages were all but tied (27–28%). AutoML isn’t yet a big part of the picture, but it appears to be gaining traction among users who are still considering or experimenting with AI. And it’s possible that we’ll see increased use of AutoML tools among mature users, of whom 45% indicated that they would be incorporating tools for automated model search and hyperparameter tuning (in a word, AutoML) in the coming year.
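
As a point of reference for what these tools automate, here is a minimal sketch of hyperparameter search using scikit-learn’s GridSearchCV on synthetic data. It is a far simpler stand-in for a commercial AutoML product, and the parameter grid and dataset are illustrative.

```python
# A minimal sketch of automated hyperparameter search with scikit-learn's
# GridSearchCV: a simple stand-in for what AutoML products automate end to end.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic, illustrative data.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [None, 10]},
    cv=3,                      # 3-fold cross-validation for each combination
    scoring="accuracy",
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```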

Deployment and Monitoring

An AI project means nothing if it can’t be deployed; even projects that are only intended for internal use need some kind of deployment. Our survey showed that AI deployment is still largely unknown territory, dominated by homegrown ad hoc processes. The three most significant tools for deploying AI all had roughly 20% adoption: MLflow (22%), TensorFlow Extended, a.k.a. TFX (20%), and Kubeflow (18%). Three products from smaller startups—Domino, Seldon, and Cortex—had roughly 4% adoption. But the most frequent answer to this question was “none of the above” (46%). Since this question was only asked of respondents with “mature” AI practices (i.e., respondents who have AI products in production), we can only assume that they’ve built their own tools and pipelines for deployment and monitoring. Given the many forms that an AI project can take, and that AI deployment is still something of a dark art, it isn’t surprising that AI developers and operations teams are only starting to adopt third-party tools for deployment.

Figure 11. Automated tools used in mature practices for deployment
and monitoring

Versioning

Source control has long been a standard practice in software development. There are many well-known tools used to build source code repositories.

We’re confident that AI projects use source code repositories such as Git or GitHub; that’s a standard practice for all software developers. However, AI brings with it a different set of problems. In AI systems, the training data is as important as, if not more important than, the source code. So is the model built from the training data: the model reflects the training data and hyperparameters, in addition to the source code itself, and may be the result of hundreds of experiments.

Our survey shows that AI developers are only starting to use tools for data and model versioning. For data versioning, 35% of the respondents are using homegrown tools, while 46% responded “none of the above,” which we take to mean they’re using nothing more than a database. 9% are using DVC, 8% are using tools from Weights & Biases, and 5% are using Pachyderm.
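
For readers wondering what a “homegrown tool” for data versioning might look like, here is a minimal sketch under the assumption that the dataset fits in a single file: record a content hash of the training data in a small manifest that can be committed alongside the code. The file names are illustrative.

```python
# A minimal, homegrown data-versioning sketch: fingerprint the training data
# so a result can later be traced back to the exact bytes it was trained on.
import hashlib
import json
import pathlib

def fingerprint(path: str) -> str:
    """Return the SHA-256 hash of a data file's contents."""
    return hashlib.sha256(pathlib.Path(path).read_bytes()).hexdigest()

def record_version(data_path: str, manifest_path: str = "data_manifest.json"):
    """Write a manifest pairing the data file with its content hash."""
    manifest = {"file": data_path, "sha256": fingerprint(data_path)}
    pathlib.Path(manifest_path).write_text(json.dumps(manifest, indent=2))
    return manifest
```

Dedicated tools like DVC build on the same idea, adding remote storage and integration with Git, but the underlying principle is the same: tie a result to the exact data it came from.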

Figure 12. Automated tools used for data versioning

Tools for model and experiment tracking were used more frequently, although the results are fundamentally the same. 29% are using homegrown tools, while 34% said “none of the above.” The leading tools were MLflow (27%) and Kubeflow (18%), with Weights & Biases at 8%.
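
As an illustration of what experiment tracking involves, here is a minimal sketch using MLflow’s Python API. It assumes a local tracking store (MLflow writes to an ./mlruns directory by default), and the model, parameters, and metric are illustrative.

```python
# A minimal experiment-tracking sketch with MLflow: log the hyperparameters,
# a metric, and the trained model for one run.
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

with mlflow.start_run():
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(X_train, y_train)
    mlflow.log_param("n_estimators", 100)        # hyperparameters for this run
    mlflow.log_metric("accuracy",
                      accuracy_score(y_test, model.predict(X_test)))
    mlflow.sklearn.log_model(model, "model")     # the model as a versioned artifact
```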

Figure 13. Automated tools used for model and experiment tracking

Respondents who are considering or evaluating AI are even less likely to use data versioning tools: 59% said “none of the above,” while only 26% are using homegrown tools. Weights & Biases was the most popular third-party solution (12%). When asked about model and experiment tracking, 44% said “none of the above,” while 21% are using homegrown tools. It’s interesting, though, that in this group, MLflow (25%) and Kubeflow (21%) ranked above homegrown tools.

Although the tools available for versioning models and data are still rudimentary, it’s disturbing that so many practices, including those that have AI products in production, aren’t using them. You can’t reproduce results if you can’t reproduce the data and the models that generated the results. We’ve said that a quarter of respondents considered their AI practice mature—but it’s unclear what maturity means if it doesn’t include reproducibility.

The Bottom Line

In the past two years, the audience for AI has grown, but it hasn’t changed much: Roughly the same percentage of respondents consider themselves to be part of a “mature” practice; the same industries are represented, and at roughly the same levels; and the geographical distribution of our respondents has changed little.

We don’t know whether to be gratified or discouraged that only 50% of the respondents listed privacy or ethics as a risk they were concerned about. Without data from prior years, it’s hard to tell whether this is an improvement or a step backward. But it’s difficult to believe that there are so many AI applications for which privacy, ethics, and security aren’t significant risks.

Tool usage didn’t present any big surprises: the field is dominated by scikit-learn, TensorFlow, PyTorch, and Keras, though there’s a healthy ecosystem of open source, commercially licensed, and cloud native tools. AutoML has yet to make big inroads, but respondents representing less mature practices seem to be leaning toward automated tools and are less likely to use scikit-learn.

The number of respondents who aren’t addressing data or model versioning was an unwelcome surprise. These practices should be foundational: central to developing AI products that have verifiable, repeatable results. While we acknowledge that versioning tools appropriate to AI applications are still in their early stages, the number of participants who checked “none of the above” was revealing—particularly since “the above” included homegrown tools. You can’t have reproducible results if you don’t have reproducible data and models. Period.

In the past year, AI in the enterprise has grown; the sheer number of respondents will tell you that. But has it matured? Many new teams are entering the field, while the percentage of respondents who have deployed applications has remained roughly constant. In many respects, this indicates success: 25% of a bigger number is more than 25% of a smaller number. But is application deployment the right metric for maturity? Enterprise AI won’t really have matured until development and operations groups can engage in practices like continuous deployment, until results are repeatable (at least in a statistical sense), and until ethics, safety, privacy, and security are primary rather than secondary concerns. Mature AI? Yes, enterprise AI has been maturing. But it’s time to set the bar for maturity higher.

NFTs: Owning Digital Art [Radar]

It would be hard to miss the commotion around non-fungible tokens (NFTs). Non-fungible tokens are, to a first approximation, purchased digital goods that exist on a blockchain. At this point, NFTs exist on the Ethereum blockchain, but there’s no reason that they couldn’t be implemented on others; it seems reasonably likely that specialized blockchains will be built for NFTs.

What kinds of value do NFTs create?  It’s certainly been claimed that they create a market for digital art, that digital artists can now get “paid” for their work.  Wikipedia points to a number of other possible uses: they could also be used to represent other collectible objects (a digital equivalent to baseball trading cards), or to represent assets in online games, or even to represent shares in a real-world athlete’s contract–or a share in an athlete’s body. Of course, there’s a secondary market in trading NFTs, just as a collector might sell a work of art from a collection.

All of these transactions rely on the idea that an NFT establishes “provenance” for a digital object. Who owns it? Who previously owned it? Who created it? Which of the many, many copies is the “original”? These are important questions for many valuable and unique physical objects: works of art, historical documents, antiques, and even real estate. NFTs present the possibility of bringing “ownership” to the virtual world: Who owns a tweet?  Who owns a jpeg, gif, or png file?

Regardless of whether you think ownership for virtual objects is important, keep in mind that digital objects are close to meaningless if they aren’t copied. If you can’t see a png or jpg in your browser, it might as well be hanging on the wall in a museum.  And that’s worth talking about, because the language of “provenance” comes directly from the museum world. If I have a painting—say, a Rembrandt—its provenance is the history of its ownership, ideally tracing it back to its original source.

An artwork’s provenance serves two purposes: academic and commercial. Provenance is important academically because it allows you to believe you’re studying the right thing: a real Rembrandt, not a copy (copying famous paintings is a time-honored part of a painter’s training, in addition to an opportunity for forgery), or something that happens to look like Rembrandt, but isn’t (“hey, dark, depressing paintings of Dutch people are sort of cool; maybe I can do one”).

Commercially, provenance allows artworks to become extremely expensive. It allows them to become fetishized objects of immense value, at least to collectors. Particularly to collectors: “Hey, my Rembrandt is worth more than your Vermeer.” It’s a lot harder to bid a painting’s price up into the millions if you are unsure about its provenance.

NFTs enable the commercial function of provenance; they allow @jack’s first tweet to become a fetishized object that’s worth millions, at least until people decide that there’s something else they’d rather pay for. They establish a playground for the ultra-wealthy; if you have so much money that you don’t care how you spend it, why not buy Jack’s first tweet? You don’t even have to stick it on the wall and look at those old Dutch guys, or worry about burglar alarms. (You do have a good password, don’t you?)

But I don’t think that’s worth very much. What about the academic function? There’s some value in studying the early history of Twitter, possibly including @jack’s first tweet. But what exactly is the NFT showing me? That these are, indeed, Jack’s bits? Certainly not; who knows (and who cares) what became of the 0s and 1s that originally lived on Jack’s laptop and Twitter’s servers? Even if the original bits still existed, they wouldn’t be meaningful—lots of people have, or have had, the same set of bits on their computers.  As any programmer knows, equality and identity aren’t the same.  In this case, equality is important (is this what @jack wrote?); identity isn’t.
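
The distinction between equality and identity is easy to see in a few lines of Python:

```python
# Two separate copies of the same bits: equal in content, distinct as objects.
original = list(b"jack's first tweet")
copy = list(b"jack's first tweet")

print(original == copy)   # True: the contents are equal
print(original is copy)   # False: they are not the same object
```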

However, an NFT doesn’t certify that the tweet is what @Jack actually said. An NFT is only about a bunch of bits, not about what the creator (or anyone else) asserts about the bits. @Jack could easily be mistaken, or dishonest (in literature, we deal all the time with authors who want to change what they have “said,” or what they meant by what they said). Our beliefs about the contents of @jack’s first tweet have everything to do with our beliefs about @jack and Twitter (where you can still find it), and nothing to do with the NFT.

A tweet is one thing; what about a digital artwork? Does an NFT establish the provenance of a digital artwork? That depends on what is meant by “the provenance of a digital artwork.” A copy of a Rembrandt is still a copy, meaning it’s not the artifact that Rembrandt created. There are all sorts of techniques, ranging from very low to very high tech, to establish the link between artist and artwork. Those techniques are meaningless in the digital world, which eliminates noise, eliminates error in making copies. So, why would I care if my copy of the bits isn’t the artist’s original? The artist’s bits aren’t the “original,” either. That sort of originality is meaningless in the digital world: did the artist ever restore from backup? Was the artwork never swapped to disk, and swapped back in? 

What “originality” really means is “this is the unique product of my mind.” We can ask any number of questions about what that might mean, but let’s keep it simple. Whatever that statement means, it’s not a statement on which an NFT or a blockchain has any bearing. We’ve already seen instances of people creating NFTs for other people’s work, and thus “owning” it.  Is this theft of intellectual property, or a meta-art form of its own? (One of my favorite avant-garde piano compositions contains the instructions “The performer should prepare any composition and then perform it as well as he can.”)

So then, what kind of statement about the originality, uniqueness, or authorship of an artwork could be established by an NFT? Beeple, who sold an NFT titled “Everydays: The First 5000 Days” for over $69 Million, says that the NFT is not about ownership of the copyright: “You can display the token and show you own the token, but, you don’t own the copyright.” I presume Beeple still owns the copyright to his work–does that mean he can sell it again? The NFT doesn’t typically include the bits that make up the artwork (I think this is possible, but only for very small objects); as @jonty points out, what the NFT actually contains isn’t the work, but a URL, a link.  That URL points to a resource (a JSON metadata file or an IPFS hash) that’s most likely on a server operated by a startup. And that resource points to the work. If that link becomes invalid (for example, if the startup goes bust), then all you “own” is an invalid link. A 404.
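
To make that chain of indirection concrete, here is a minimal sketch in Python. The metadata URL is hypothetical, and “image” is only the conventional field name in common NFT metadata, not something the token itself guarantees.

```python
# A sketch of the indirection described above: the token stores a metadata
# URL, the metadata points at the artwork, and either hop can return a 404.
import requests

def resolve_nft(metadata_url: str) -> str:
    """Follow token metadata to the artwork it references."""
    meta_resp = requests.get(metadata_url, timeout=10)
    if meta_resp.status_code != 200:
        return f"Metadata link is dead (HTTP {meta_resp.status_code})"
    image_url = meta_resp.json().get("image", "")   # conventional metadata field
    if not image_url:
        return "Metadata has no image field"
    art_resp = requests.head(image_url, timeout=10)
    if art_resp.status_code != 200:
        return f"Artwork link is dead (HTTP {art_resp.status_code})"
    return f"Artwork still reachable at {image_url}"

# Hypothetical URL, for illustration only.
print(resolve_nft("https://example.com/token/1.json"))
```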

Some of these problems may be addressable; some aren’t.  The bottom line, though, is that the link between a creator and a work of art can’t be established by cryptographic checksums.

So do NFTs create a market for artwork that didn’t exist before?  Perhaps–though if what’s bought and sold isn’t the actual work (which remains infinitely and perfectly reproducible), or even the right to reproduce the work (copyright), it’s not clear to me how this really benefits artists, or even how it changes the picture much.  I suppose this is a sort of 21st century patronage, in which someone rich gives an artist a pile of money for being an artist (or gives Jack Dorsey money for being @jack). As patronage, it’s more like Prince Esterhazy than Patreon. A few artists will make money, perhaps even more money than they would otherwise, because I see no reason you can’t sell the work itself in addition to the NFT. Or sell multiple NFTs referencing the same work. But most won’t. The irreducible problem of being an artist–whether that’s a musician, a painter, or a sculptor, whether the medium is digital or physical–is that there are more people who want the job than there are people willing to pay.

In the end, what do NFTs create? A kind of digital fetishism around possessing bits, but perhaps not much else. An NFT shows that you are able to spend money on something–without involving the “something” itself. As Beeple says, “you can display the token.” This is conspicuous consumption in perhaps its purest form. It’s like buying jewelry and framing the receipt. That an explosion in conspicuous consumption should arise at this point in history isn’t surprising. The tech community is awash in wealth: wealth from unicorn startups that will never make a cent of profit, wealth from cryptocurrencies that are very difficult to use to buy or sell anything. What’s the value of being rich if you can’t show it off? How do you show something off during a socially distanced pandemic? And if all you care about is showing off your wealth, the NFT is where the real value lies, not in the artwork. You can buy, sell, or trade them, just like baseball cards. Just don’t mistake an NFT for “ownership” in anything but the NFT itself.

Banksy’s self-destroying artwork was much more to the point. Unlike Banksy’s many public murals, which anyone can enjoy for free, this painting shredded itself as soon as it was bought at auction. Buying it destroyed it.

Radar trends to watch: April 2021 [Radar]

March was a busy month. There’s been a lot of talk about augmented and virtual reality, with hints and speculation about products from Apple and Facebook. In the next two years, we’ll see whether this is more than just talk. We’ve also seen more people discussing operations for machine learning and AI, including a substantive talk by Andrew Ng. We’ve long believed that operations was the unacknowledged elephant in the room; it’s finally making it into the open. And we’ve had our share of bad news: proposals for military use of AI, increased surveillance (for example, automated license plate readers at luxury condominiums connected to police departments). More than ever, we have to ask ourselves what kind of world we want to build.

AI

  • Contentyze is a free, publicly available language model that claims to be GPT-3-like. It works fairly well. Wired also points to a free GPT-3-like model called Eleuther.
  • The AI Infrastructure Alliance wants to describe a canonical stack for AI, analogous to LAMP or MEAN; they see it as a way to free AI from domination by the technology giants.
  • Global treaties on the use of AI in warfare?  The time may have come.  But verifying compliance is extremely difficult.  Nuclear weapons are easy in comparison.
  • Operations for Machine Learning (i.e., integrating it into CI/CD processes) is the big challenge facing businesses in the coming years. This isn’t the first time operations for ML and AI have appeared in Trends…  but people are getting the message.
  • The next step in AI is Multimodal: AI that combines multiple abilities and multiple senses, starting with computer vision and natural language.
  • Smart drones kill moths by crashing into them, to prevent damage to crops. Pesticide-free agricultural pest control.
  • Tesla’s fully self-driving car isn’t fully self-driving, and that’s the good part. Musk still seems to think he can have a fully self-driving car by the end of 2021, apparently by skipping the hard work.
  • Turn any dial into an API with a camera and some simple computer vision: Pete Warden’s notion of TinyAI could be used to make everything machine-readable, including electric meters and common appliances.
  • The National Security Commission on Artificial Intelligence has published a huge and wide-ranging report on the future development of AI in the US, covering both business and military applications. Recommendations include the military development of AI-based weapons, and the creation of a quasi-military academy for developing AI expertise.
  • A robotic lifeguard: an autonomous underwater robot for rescuing swimmers.

Data

  • We have been building centralized data systems for the past decade. The pendulum is about to swing the other way: data decentralization will be driven in part by regulation, in part by changes in advertising platforms, and in part by competition between cloud platforms.
  • Thoughtworks’ thoughts on building a digital healthcare ecosystem: put the patients first (not the providers), make data accountable, build and share knowledge, leverage new technology.
  • Empowering the public to resist the surveillance state: data strikes and data poisoning reimagined as collective action, in a paper presented at the FAccT conference.

Social Media

  • Zuckerberg proposes that social media platforms “should be required to demonstrate that they have systems in place for identifying unlawful content and removing it.” Such a policy would give a significant advantage to established players–but in that, it’s not unlike laws requiring safe disposal of toxic waste.
  • Either ignoring or unaware of the potential for abuse, Slack added a feature allowing unblockable direct messages from paid users to any users of the system (not just users from the same organization). While message delivery in Slack can be stopped, email containing the message body can’t. Slack is promising to fix this feature.

Programming

  • Nokia has released the Plan 9 Operating System (started at Bell Labs by Rob Pike, Ken Thompson, Dennis Ritchie, and others) under the open source MIT license.  No one knows whether it will prosper, but it is the first significantly new operating system we’ve seen in years.
  • An important take on performance: it’s not about speeds, it’s about statistics and what happens at the edges of the distribution. Understanding queuing theory is the key, not MHz and Mbps (see the sketch after this list).
  • Is Microsoft’s low-code, Excel-based open source programming language Power Fx what brings programming to the masses?
  • Non-Intrusive Production Debugging: Is this a trend? Or just a flash in the pan? The ability to run a debugger on code running in production and observe what is happening line-by-line seems like magic.
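
To make the performance point above concrete, here is a toy Python sketch with synthetic latencies: a distribution whose mean looks healthy while its 99th percentile does not. The numbers are illustrative.

```python
# A toy illustration of tail latency: the mean hides what happens at the
# edges of the distribution. Synthetic, illustrative numbers.
import numpy as np

rng = np.random.default_rng(0)
latencies_ms = np.concatenate([
    rng.normal(20, 5, size=9_900),      # typical requests
    rng.normal(2_000, 300, size=100),   # the unlucky 1%
])

print(f"mean: {latencies_ms.mean():.0f} ms")
print(f"p50:  {np.percentile(latencies_ms, 50):.0f} ms")
print(f"p99:  {np.percentile(latencies_ms, 99):.0f} ms")
```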

Augmented Reality

  • As part of its augmented reality strategy, Facebook is developing a non-invasive wristband-based neural interface that lets you control digital objects with thought.
  • The killer app for AR might be audio: smart headphones and hearing aids that can extract important sounds (conversations, for example) from a sea of noise.
  • Mojo Vision has developed very low power chips for use in AR contact lenses.
  • Facebook is talking more about its AR/VR glasses, along with new kinds of user interfaces, in which AI mediates literally every part of the wearer’s experience.

Security

  • Google’s Project Zero security team, which has been responsible for disclosing many vulnerabilities (and getting vendors to fix them), has just exposed a number of vulnerabilities that were actively being used by government organizations in counter-terrorist activities.
  • Botnets have been observed storing key configuration information in cryptocurrency blockchains, including the IP addresses of infected systems. Taking down the botnet’s control server is no longer an effective defense, because the server can easily be rebuilt.
  • Tens of thousands of Microsoft Exchange Server installations have been compromised. Some of the servers may have been attacked by a group connected to the Chinese government, though there are several variants of the attack, suggesting multiple actors.
  • The problem with a walled garden: once the attackers are in, the walls are protecting them, too.  iOS’s security features make successful attacks very difficult; but when they succeed, they are almost impossible to detect.

Biology

  • The CRISPR equivalent of a laptop: Onyx is a small, portable, and (relatively) inexpensive tool for automating CRISPR gene editing.  It could make CRISPR much more widely accessible, much as the laptop did for computing.
  • NVidia has made a breakthrough in using deep learning for genetic research. In addition to reducing the time to do some analyses from days to hours, the approach significantly reduces the number of cells needed, making it easier to do research on rare genetic diseases.

Web

  • California has banned user interface “dark patterns”: intentionally confusing user interface designs used to prevent people from opting out of data collection.
  • “Headless” WordPress: WordPress as an API for content, using the JAMstack (JavaScript, APIs, and Markup) for rendering rather than PHP.
  • Project Gemini claims to be recreating the web.  It’s more than gopher, but not much more.  My biggest question is whether anyone cares about old-style “internet browsing” any more?
  • Is the next step for web developers HTML over WebSockets?  Developers are starting to realize that browser-side JavaScript development has resulted in spiraling complexity, poor performance, and buggy applications.

Blockchain

  • Non-fungible tokens (NFTs) have taken the blockchain world by storm. But it’s not clear that NFTs have any real application. What is the value of proving that you own a tweet or an emoji?

InfoTribes, Reality Brokers [Radar]

It seems harder than ever to agree with others on basic facts, let alone to develop shared values and goals: we even claim to live in a post-truth era1. With anti-vaxxers, QAnon, Bernie Bros, flat earthers, the intellectual dark web, and disagreement worldwide as to the seriousness of COVID-19 and the effectiveness of masks, have we lost our shared reality? For every piece of information X somewhere, you can likely find “not X” elsewhere. There is a growing disbelief and distrust in basic science and government. All too often, conversations on social media descend rapidly to questions such as “What planet are you from?”

Reality Decentralized

What has happened? Reality has once again become decentralized. Before the advent of broadcast media and mass culture, individuals’ mental models of the world were generated locally, along with their sense of reality and what they considered ground truth. With broadcast media and the culture industries came the ability to forge top-down, national identities that could be pushed into the living rooms of families at prime time, completing the project of the press and newspapers in nation-forming2. The creation of the TV dinner was perhaps one of the most effective tools in carving out a sense of shared reality at a national level (did the TV dinner mean fewer people said Grace?).

The rise of the Internet, Search, social media, apps, and platforms has resulted in an information landscape that bypasses the centralized knowledge/reality-generation machine of broadcast media. It is, however, driven by the incentives (both visible and hidden) of significant power structures, such as Big Tech companies. With the degradation of top-down knowledge, we’ve seen the return of locally-generated shared realities, where local now refers to proximity in cyberspace. Content creators and content consumers are connected, share information, and develop mental models of the world, along with shared or distinct realities, based on the information they consume. They form communities and shared realities accordingly and all these interactions are mediated by the incentive systems of the platforms they connect on.

As a result, the number of possible realities has proliferated and the ability to find people to share any given reality with has increased. This InfoLandscape we all increasingly occupy is both novel and shifting rapidly. In it, we are currently finding people we can share some semblance of ground truth with: we’re forming our own InfoTribes, and shared reality is splintering around the globe.

To understand this paradigm shift, we need to comprehend:

  • the initial vision behind the internet and the InfoLandscapes that have emerged,
  • how we are forming InfoTribes and how reality is splintering,
  • that large-scale shared reality has merely occupied a blip in human history, ushered in by the advent of broadcast media, and
  • who we look to for information and knowledge in an InfoLandscape that we haven’t evolved to comprehend.

The InfoLandscapes

“Cyberspace. A consensual hallucination experienced daily by billions of legitimate operators, in every nation, by children being taught mathematical concepts… A graphic representation of data abstracted from the banks of every computer in the human system. Unthinkable complexity. Lines of light ranged in the nonspace of the mind, clusters, and constellations of data. Like city lights, receding.”

Neuromancer, William Gibson (1984)

There are several ways to frame the origin story of the internet. One is how it gave rise to new forms of information flow: the vision of a novel space in which anybody could publish anything and everyone could find it. Much of the philosophy of early internet pioneers was couched in terms of the potential to “flatten organizations, globalize society, decentralize control, and help harmonize people” (Nicholas Negroponte, MIT)3.

As John Perry Barlow (of Grateful Dead fame) wrote in A Declaration of the Independence of Cyberspace (1996):

We are creating a world that all may enter without privilege or prejudice accorded by race, economic power, military force, or station of birth. We are creating a world where anyone, anywhere may express his or her beliefs, no matter how singular, without fear of being coerced into silence or conformity. Your legal concepts of property, expression, identity, movement, and context do not apply to us. They are all based on matter, and there is no matter here.

This may have been the world we wanted but not the one we got. We are veering closer to an online and app-mediated environment similar to Deleuze’s Societies of Control, in which we are increasingly treated as our data and what Deleuze calls “dividuals”: collections of behavior and characteristics, associated with online interactions, passwords, spending, clicks, cursor movements, and personal algorithms, that can be passed into statistical and predictive models and guided and incentivized to behave and spend in particular ways. Put simply, we are reduced to the inputs of an algorithm. On top of this, pre-existing societal biases are being reinforced and promulgated at previously unheard of scales as we increasingly integrate machine learning models into our daily lives.

Prescient visions of society along these lines were provided by William Gibson’s Neuromancer and Neal Stephenson’s 1992 Snow Crash: societies increasingly interacting in virtual reality environments and computational spaces, in which the landscapes were defined by information flows4. Not only this, but both authors envisioned such spaces being turned into marketplaces and segmented and demarcated by large corporations, only a stone’s throw from where we find ourselves today. How did we get here?

Information Creation

In the early days of the internet, you needed to be a coder to create a website. The ability to publish material was relegated to the technical. It was only in walled gardens such as CompuServe and AOL or after the introduction of tools like Blogger that regular punters were able to create their own websites with relative ease. The participatory culture and user-generated content of Web 2.0 opened up the creative space, allowing anyone and everyone to create content, as well as respond to, rate, and review it. Over the last decade, two new dynamics have drastically increased the amount of information creation, and, therefore, the “raw material” with which the landscape can be molded:

  1. Smartphones with high-resolution video cameras and
  2. The transformation of the attention economy by “social media” platforms, which incentivize individuals to digitize more of their experiences and broadcast as much as possible.

And it isn’t only the generation of novel content or the speed at which information travels. It is also the vast archives of human information and knowledge that are being unearthed, digitized, and made available online. This is the space of content creation.

Information Retrieval

The other necessary side of information flow is discoverability, how it is organized, and where it’s surfaced. When so much of the world’s information is available, what is the method for retrieval? Previously the realm of chat rooms and bulletin boards, this question eventually gave rise to the creation of search engines, social media platforms, streaming sites, apps, and platforms.

Platforms that automate the organizing and surfacing of online content are necessary, given the amount of content currently out there and how much is being generated daily. And they also require interrogating, as we humans base our mental models of how the world works on the information we receive, as we do our senses of reality, the way we make decisions, and the communities we form. Platforms such as Facebook have erected walled gardens in our new InfoLandscape and locked many of us into them, as predicted by both Gibson and Stephenson. Do we want such corporatized and closed structures in our networked commons?

InfoTribes, Shared Reality

Online spaces are novel forms of community: people who haven’t met and may never meet in real life interacting in cyberspace. As scholars such as danah boyd have made clear, “social network sites like MySpace and Facebook are networked publics, just like parks and other outdoor spaces can be understood as publics.”

One key characteristic of any community is a sense of shared reality, something agreed upon. Communities are based around a sense of shared reality, shared values, and/or shared goals. Historically, communities have required geographical proximity to coalesce, whereas online communities have been able to form outside the constraints of meatspace. Let’s not make the mistake of assuming online community formation doesn’t have constraints. The constraints are perhaps more hidden, but they exist: they’re both technological and the result of how the InfoLandscapes have been carved out by the platforms, along with their technological and economic incentives5. Landscapes and communities have co-evolved, although, for most of history, on different timescales: mountain ranges can separate parts of a community and, conversely, we build tunnels through mountains; rivers connect communities, cities, and commerce, and humans alter the nature of rivers (an extreme example being the reversal of the Chicago River!).

The past two decades have seen the formation of several new, rapidly and constantly shifting landscapes that we all increasingly interact with, along with the formation of new information communities, driven and consolidated by the emergent phenomena of filter bubbles and echo chambers, among many others, themselves driven by the platforms’ drive for engagement. What the constituents of each of these communities share are mental models of how the world works, senses of reality, that are, for the most part, reinforced by the algorithms that surface content, either by 1) showing content you agree with to promote engagement or 2) showing content you totally disagree with to the same end. Just as the newspaper page has historically been a mish-mash collection of movie ads, obituaries, and opinions stitched together in a way that made the most business and economic sense for any given publisher, your Facebook feed is driven by a collection of algorithms that, in the end, are optimizing for growth and revenue6. These incentives define the InfoLandscape and determine the constraints under which communities form. It just so happens that dividing people increases engagement and makes economic sense. As Karen Hao wrote recently in the MIT Technology Review, framing it as a result of “Zuckerberg’s relentless desire for growth,” which is directly correlated with economic incentives:

The algorithms that underpin Facebook’s business weren’t created to filter out what was false or inflammatory; they were designed to make people share and engage with as much content as possible by showing them things they were most likely to be outraged or titillated by.

The consequence? As groups of people turn inward, agreeing more amongst their in-group, and disagreeing more fervently with those outside of it, the common ground in between, the shared reality, which is where perhaps the truth lies, is slowly lost. Put another way, a by-product of algorithmic polarization and fragmentation has been the formation of more groups that agree within their own groups and disagree far more with other groups, not only on what they value but on ground truth, about reality.

We’ve witnessed the genesis of information tribes or InfoTribes and, as these new ideological territories are carved up, those who occupy InfoLandscapes hold that ground as a part of an InfoTribe7. Viewed in this way, the online flame wars we’ve become all too accustomed to form part of the initial staking out of territory in these new InfoLandscapes. Anthropologists have long talked about tribes as being formed around symbols of group membership, symbols that unite a people, like totem animals, flags, or… online content.

Reality Brokers, Reality Splintering

Arguably, many people aren’t particularly interested in the ground truth per se, they’re interested in narratives that support their pre-existing mental models of the world, narratives that help them sleep at night. This is something that 45 brilliantly, and perhaps unwittingly, played into and made starkly apparent, by continually sowing seeds of confusion, gaslighting the global community, and questioning the reality of anything that didn’t serve his own purposes.

This trend isn’t confined to the US. The rise of populism more generally in the West can be seen as the result of diverging senses of reality, the first slice splitting people across ideological and party lines. Why are these divergences in a sense of shared reality becoming so exacerbated and apparent now? The unparalleled velocity at which we receive information is one reason, particularly as we likely haven’t evolved to even begin to process the vast amounts we consume. But it isn’t only the speed and amount, it’s the structure. The current media landscape is highly non-linear, as opposed to print and television. Our sense-making and reality-forming faculties are overwhelmed daily by the fractal-like nature of (social) media platforms and environments that are full of overlapping phenomena and patterns that occur at many different frequencies8. Moreover, the information we’re served is generally driven by opaque and obscure economic incentives of platforms, which are protected by even more obscure legislation in the form of Section 230 in the US (there are other incentives at play, themselves rarely surfaced, in the name of “trade secrets”).

But let’s be careful here: it isn’t tech all the way down. We’re also deep in a decades-long erosion of institutional knowledge, with mistrust of science and of government being the two most obvious examples. Neoliberalism has carved out the middle class while the fruits of top-down knowledge have left so many people unserved and behind. On top of this, ignorance has been actively cultivated and produced. Look no further than the recent manufacturing of ignorance from the top down with the goals of creating chaos, sowing the seeds of doubt, and delegitimizing the scientific method and data reporting (the study of culturally induced ignorance is known as agnotology, and Proctor and Schiebinger’s book Agnotology: The Making and Unmaking of Ignorance is canonical). Beyond that, we’ve seen the impact of bad actors and foreign influence (not mutually exclusive) on the dismantling of shared reality, such as Russian interference around the 2016 US election.

This has left reality up for grabs and, in an InfoLandscape exacerbated by a global pandemic, those who control and guide the flow of information also control the building of InfoTribes, along with their shared realities. Viewed from another perspective, the internet is a space in which information is created and consumed, a many-sided marketplace of supply-and-demand in which the dominant currency is information, albeit driven by a shadow market of data, marketing collateral, clicks, cash, and crypto. The platforms that “decide” what we see and when we see it are reality brokers in a serious sense: they guide how individuals construct their sense of the world, their own identities, what they consider ground truth, and the communities they become a part of. In some cases, these reality brokers may be doing it completely by accident. They don’t necessarily care about the ground truth, just about engagement, attention, and profit: the breakdown of shared reality as collateral damage of a globalized, industrial-scale incentive system. In this framework, the rise of conspiracy theories is an artefact of this process: the reality brokered and formed, whether it be a flat earth or a cabal of Satan-worshipping pedophiles plotting against 45, is a direct result of the bottom-up sense-making of top-down reality splintering, the dissolution of ground truth and the implosion of a more general shared reality. Web 2.0 has had a serious part to play in this reality splintering but the current retreat away into higher signal and private platforms such as newsletters, Slack, Discord, WhatsApp, and Signal groups could be more harmful, in many ways.

Shared reality is breaking down. But was it even real in the first place?

Shared Reality as Historical Quirk

Being born after World War Two could lead one to believe that shared reality is foundational for the functioning of the world and that it’s something that always existed. But there’s an argument that shared reality, on national levels, was really ushered in by the advent of broadcast media, first the radio, which was in over 50% of US households by the mid-1930s, and then the television, nuclear suburban families, and TV dinners. The hegemonic consolidation of the American dream was directly related to the projection of ABC, CBS, and NBC into each and every household. When cable opened up TV to more than three major networks, we began to witness the fragmentation and polarization of broadcast media into more camps, including those split along party lines, modern exemplars being Fox News and CNN. It is key to recognize that there were distinct and differing realities in this period, split along national lines (USA and Soviet Russia), ideological lines (pro- and anti-Vietnam), and scientific lines (the impact of smoking and asbestos). Even then, it was a large number of people with a small number of shared realities.

The spread of national identity via broadcast media didn’t come out of the blue. It was a natural continuation of similar impacts of “The Printed Word,” which Marshall McLuhan refers to as an “Architect of Nationalism” in Understanding Media:

Socially, the typographic extension of man brought in nationalism, industrialism, mass markets, and universal literacy and education. For print presented an image of repeatable precision that inspired totally new forms of extending social energies.

Note that the shared realities generated in the US in the 20th century weren’t generated only by national and governmental interests but also by commercial and corporate ones: mass culture, the culture industries, culture at scale as a function of the rise of the corporation. There were strong incentives for commercial interests to create shared realities at scale across the nation because it’s easier to market and sell consumer goods, for example, to a homogeneous mass: one size fits all, one shape fits all. This was achieved through the convergence of mass media, modern marketing, and PR tactics.

Look no further than Edward Bernays, a double nephew of Freud who was referred to in his obituary as “the Father of Public Relations.” Bernays famously “used his Uncle Sigmund Freud’s ideas to help convince the public, among other things, that bacon and eggs was the true all-American breakfast.” In the abstract of his 1928 paper “Manipulating Public Opinion: The Why and the How,” Bernays wrote:

If the general principles of swaying public opinion are understood, a technique can be developed which, with the correct appraisal of the specific problem and the specific audience, can and has been used effectively in such widely different situations as changing the attitudes of whites toward Negroes in America, changing the buying habits of American women from felt hats to velvet, silk, and straw hats, changing the impression which the American electorate has of its President, introducing new musical instruments, and a variety of others.

The Century of Marketing began, in some ways, with psychoanalytical tools: marketing as a mode of reality generation, societal homogenization, and behavioral modification. A paradigmatic example is how De Beers convinced the West to adopt diamonds as the necessary gem for engagement rings. A horrifying and still relevant example is Purdue Pharma and the Sackler dynasty’s marketing of OxyContin.

The channels used by marketers were all of the culture industries, including broadcast media, a theme most evident in the work of the Frankfurt School, notably in that of Theodor Adorno and Max Horkheimer. Look no further than Adorno’s 1954 essay “How to Look at Television”:

The old cultured elite does not exist any more; the modern intelligentsia only partially corresponds to it. At the same time, huge strata of the population formerly unacquainted with art have become cultural “consumers.”

Although all the culture industries of the 20th century worked to homogenize society at the behest of corporate interests, television was the one we brought into our living rooms and eventually watched with family over dinner. Top-down reality generation was centralized and projected into nuclear suburban homes.

Fast forward to today, the post-broadcast era: information travels close to the speed of light, as laser pulses along fiber-optic cables; it’s both multi-platform and personalized; and everyone is a potential creator. Reality, once again, is decentralized. In this frame, the age of shared reality was the anomaly, the exception rather than the rule. It’s perhaps ironic that one of the final throes of the age of shared reality was the advent of reality TV, a hyper-simulation of reality filtered through broadcast media. So now, in a fractured and fractal InfoLandscape, who do we look to in our efforts to establish some semblance of ground truth?

Verified Checkmarks and Village Elders

When COVID-19 hit, we were all scrambling around for information about reality in order to make decisions, and not only were the stakes a matter of life and death but, for every piece of information you found somewhere, you could find the opposite somewhere else. The majority of information, for many, came through social media feeds. Even when the source was broadcast media, a lot of the time it would be surfaced in a social media feed. Who did I pay attention to? Who did I believe? How about you? For better or for worse, I looked to my local (in an online sense) community, those whom I considered closest to me in terms of shared values and shared reality. On top of this, I looked to those respected in my communities. On Twitter, for example, I paid attention to Dr Eleanor Murray and Professor Nicholas Christakis, among many others. And why? They’re both leaders in their fields with track records of deep expertise, for one. But they also have a lot of Twitter followers and have the coveted blue verified checkmarks: in an InfoLandscape of such increasing velocity, we use rules of thumb and heuristics around what to believe and what not to, including the validity and verifiability of the content creator, signaled by their number of followers, who those followers are (do I follow any of them? And what do I think of them?), and whether or not the platform has verified them.

If our online communities are our InfoTribes, then the people we look to for ground truth are our village elders, those who tell stories around the campfire. Because they are seen to have insight into the nature of reality, we look to them as our illiterate ancestors looked to those who could read, or as pre-Reformation Christians looked to the priests who could read Biblical Latin. With the emergence of these decentralized and fractured realities, we are seeing, hand in hand, the rise of those who define the realities of each InfoTribe. It’s no wonder the term Thought Leader rose to prominence as this landscape clarified itself. We are also arguably in the midst of a paradigm shift from content being the main object of verification online to content creators themselves being the ones verified. As Robyn Caplan astutely points out in Pornhub Is Just the Latest Example of the Move Toward a Verified Internet:

It is often said that pornography drives innovation in technology, so perhaps that’s why many outlets have framed Pornhub’s verification move as “unprecedented.” However, what is happening on Pornhub is part of a broader shift online: Many, even most, platforms are using “verification” as a way to distinguish between sources, often framing these efforts within concerns about safety or trustworthiness.

But mainstream journalists are more likely to be verified than independent journalists, men more likely than women, and, as Caplan points out, “there is a dearth of publicly available information about the demographics of verification in general—for instance, whether BIPOC users are verified at the same rates as white users.” And it is key to note that many platforms are increasingly verifying and surfacing content created by “platform partners,” an approach also driven by business incentives. Who decides who we listen to? And, as Shoshana Zuboff continually asks, Who decides who decides?

This isn’t likely to get better anytime soon, given the retreat to private and higher-signal communication channels, the next generation of personalized products, the advent of deepfakes, the increasing amount of information we’ll be getting from voice assistants over the coming 5-10 years, the growing proportion of information consumed via ephemeral, voice-only apps such as Clubhouse, and the possibility of augmented reality playing an increasing role in our daily lives.

So what to do? Perhaps instead of trying to convince people of what we believe to be true, we need to stop asking “What planet are you from?” and start looking for shared foundations in our conversations, a sense of shared reality. We also have a public awareness crisis on our hands, as the old methods of media literacy and education have stopped working. We need to construct new methods that help people build awareness, educate themselves, and create the ability to dissent. Public education will need to bring to light the true contours of the emergent InfoLandscapes, some key aspects of which I have attempted to highlight in this essay. It will also likely include developing awareness of all our information platforms as multi-sided marketplaces, building a growing compendium of the informational dark patterns at play, developing informational diets and new ways to count InfoCalories, and bringing antitrust suits against the largest reality brokers. Watch these spaces.


Many thanks to Angela Bowne, Anthony Gee, Katharine Jarmul, Jamie Joyce, Mike Loukides, Emanuel Moss, and Peter Wang for their valuable and critical feedback on drafts of this essay along the way.


Footnotes

1. A term coined in 1992 by the playwright Steve Tesich; it was also the Oxford Dictionaries 2016 Word of the Year (source: Post-Truth and Its Consequences: What a 25-Year-Old Essay Tells Us About the Current Moment)
2. See Benedict Anderson’s Imagined Communities for more about the making of nations through shared reading of print media and newspapers.
3. I discovered this reference in Fred Turner’s startling book From Counterculture to Cyberculture, which traces the countercultural roots of the internet to movements such as the New Communalists, leading many tech pioneers to have a vision of the web as “a collaborative and digital utopia modeled on the communal ideals” and “reimagined computers as tools for personal [and societal] liberation.”
4. There is a growing movement recognizing the importance of information flows in society. See, for example, OpenMined’s free online courses which are framed around the theme that “Society runs on information flows.”
5. Think Twitter, for example, which builds communities by surfacing specific tweets for specific groups of people, a surfacing that’s driven by economic incentives, among others; although do note that TweetDeck, owned by Twitter, does not show ads, surface tweets, or recommend follows: perhaps the demographic that mostly uses TweetDeck doesn’t click on ads?
6. Having said this, there are some ethical constraints in the physical publishing business; for example, you can’t run an ad for a product across from an article about or review of that product. There are also forms of transparency and accountability in physical publishing: we can all see what any given broadsheet publishes, discuss it, and interrogate it collectively.
7. Related concepts are the digital tribe, a group of people who share common interests online, and the memetic tribe, “a group of agents with a meme complex, or memeplex, that directly or indirectly seeks to impose its distinct map of reality—along with its moral imperatives—on others.”
8. Is it a coincidence that we’re also currently seeing the rise of non-linear note-taking, knowledge base, and networked thought tools, such as Roam Research and Obsidian?

The End of Silicon Valley as We Know It? [Radar]

High-profile entrepreneurs like Elon Musk, venture capitalists like Peter Thiel and Keith Rabois, and big companies like Oracle and HP Enterprise are all leaving California. During COVID-19, Zoom-enabled tech workers have discovered the benefits of remote work from cheaper, less congested communities elsewhere. Is this the end of Silicon Valley as we know it? Perhaps. But other challenges to Silicon Valley’s preeminence are more fundamental than the tech diaspora.

Understanding four trends that may shape the future of Silicon Valley is also a road map to some of the biggest technology-enabled opportunities of the next decades:

  1. Consumer internet entrepreneurs lack many of the skills needed for the life sciences revolution.
  2. Internet regulation is upon us.
  3. Climate response is capital intensive, and inherently local.
  4. The end of the betting economy.

Inventing the future

“The best way to predict the future is to invent it,” Alan Kay once said. 2020 proved him both right and wrong. The coronavirus pandemic, or something worse, had long been predicted, but it still caught the world unprepared, a better future not yet invented. Climate change too has been on the radar, not just for decades but for over a century, since Arrhenius’s 1896 paper on the greenhouse effect. And it has long been known that inequality and caste are corrosive to social stability and predict the fate of nations. Yet again and again the crisis finds us unprepared when it comes.

In each case, though, the long-predicted future is still not foreordained.