Radar Trends to Watch: July 2022 – O’Reilly

This month, large models are even more in the news than last month: the open source BLOOM model is almost finished, Google’s LaMDA is good enough that it can trick people into thinking it’s sentient, and DALL-E has gotten even better at drawing what you ask.

The most important issue facing technology might now be the protection of privacy. While that’s not a new concern, it’s a concern that most computer users have been willing to ignore, and that most technology companies have been willing to let them ignore. New state laws that criminalize having abortions out of state and the stockpiling of location information by antiabortion groups have made privacy an issue that can’t be ignored.

Artificial Intelligence

  • BigScience has almost finished training its open source BLOOM language model, which was developed by volunteer researchers and trained using public funds. BLOOM will provide an open, public platform for research into the capabilities of large language models and, specifically, issues like avoiding bias and toxic language.
  • AI tools like AlphaFold2 can create new proteins, not just analyze existing ones; the unexpected creation of new artifacts by an AI system is playfully called “hallucination.” The proteins designed so far probably aren’t useful; still, this is a major step forward in drug design.
  • Microsoft is limiting or removing access to some features in its face recognition service, Azure Face. Organizations will have to tell Microsoft how and why facial recognition will be used in their systems; and services like emotion recognition will be removed completely.
  • Amazon plans to give Alexa the ability to imitate anyone’s voice, using under a minute of audio. They give the example of a (possibly dead) grandmother “reading” a book to a child. Other AI vendors (most notably OpenAI/Microsoft) have considered such mimicry unethical.
  • Dolt is a SQL database that lets you version data using git commands. You can clone, push, pull, fork, branch, and merge just as with git; you access data using standard SQL.
  • It’s sadly unsurprising that a robot incorporating a widely-used neural network (OpenAI CLIP) learns racist and sexist biases, and that these biases affect its performance on tasks.
  • Building autonomous vehicles with memory, so that they can learn about objects on the routes they drive, may be an important step in making AVs practical. In real life, most people drive over routes they are already familiar with. Autonomous vehicles should have the same advantage.
  • The argument about whether Google’s LaMDA is “sentient” continues, with a Google engineer placed on administrative leave for publishing transcripts of conversations that he claimed demonstrate sentience. Or are large language models just squirrels?
  • For artists working in collaboration with AI, the possibilities and imperfections of AI are a means of extending their creativity.
  • Pete Warden’s proposal for ML Sensors could make developing embedded ML systems much simpler: push the machine learning into the sensors themselves.
  • Researchers using DALL-E 2 discovered that the model has a “secret vocabulary” that’s not human language, but that can be used somewhat reliably to create consistent pictures. It may be an artifact of the model’s inability to say “I didn’t understand that”; given nonsense input, it is pulled towards similar words in the training corpus.
  • HuggingFace has made an agreement with Microsoft that will allow Azure customers to run HuggingFace language models on the Azure platform.
  • The startup Predibase has built a declarative low-code platform for building AI systems. In a declarative system, you describe the outcome you want, rather than the process for creating the outcome. The system figures out the process.
  • Researchers are developing AI models that implement metamemory: the ability to remember whether or not you know something.
  • As the population ages, it will be more important to diagnose diseases like Alzheimer’s early, when treatment is still meaningful. AI is providing tools to help doctors analyze MRI images more accurately than humans. These tools don’t attempt diagnosis; they provide data about brain features.
  • Google has banned the training of Deepfakes on Colab, its free Jupyter-based cloud programming platform.

Metaverse

  • Samsung and RedHat are working on new memory architectures and device drivers that will be adequate to the demands of a 3D-enabled, cloud-based metaverse.
  • The Metaverse Standards Forum is a new industry group with the goal of solving interoperability problems for the Metaverse. It views the Metaverse as the outgrowth of the Web, and plans to coordinate work between existing standards groups (like the W3C) relevant to the Metaverse.
  • Can the “Open Metaverse” be the future of the Internet? The Open Metaverse Interoperability Group is building vendor-independent standards for social graphs, identities, and other elements of a Metaverse.
  • Holographic heads-up displays allow for 3D augmented reality: the ability to project 3D images onto the real world (for example, onto a car’s windshield).
  • Google’s Visual Position Service uses the data they’ve collected through Street View to provide high-accuracy positioning data for augmented reality applications. (This may be related to Niantic’s VPS, or they may just be using the same acronym.)

Security

Programming

  • Amazon has launched CodeWhisperer, a direct competitor to GitHub Copilot.
  • Linus Torvalds predicts that Rust will be used in the Linux kernel by 2023.
  • GitHub Copilot is now generally available (for a price); it’s free to students and open source maintainers. Corporate licenses will be available later this year.
  • WebAssembly is making inroads. The universal WebAssembly runtime, Wasmer, runs any code, on any platform. Impressive, if it delivers.
  • Can WebAssembly replace Docker? Maybe, in some applications. WASM provides portability and eliminates some security issues (possibly introducing its own); Docker sets up environments.
  • Mozilla’s Project Bergamot is an automated translation tool designed for use on the Web. It can be used to build multilingual forms and other web pages. Unlike most other AI technologies, Bergamot runs in the browser using WASM. No data is sent to the cloud.
  • Microsoft has released a framework called Fluid for building collaborative apps, such as Slack, Discord, and Teams. Microsoft will also be releasing Azure Fluid Relay to support Fluid-based applications.
  • Dragonfly is a new in-memory database that claims significantly faster performance than memcached and Redis.
  • The Chinese government has blocked access to open source code on Gitee, the Chinese equivalent to GitHub, saying that all code must be reviewed by the government before it can be released to the public.

Web3

  • Is Blockchain Decentralized? A study commissioned by DARPA investigates whether a blockchain is truly immutable, or whether it can be modified, not by exploiting cryptographic vulnerabilities, but by attacking the blockchain’s implementation, networking, and consensus protocols. This is the most comprehensive examination of blockchain security that we’ve seen.
  • Jack Dorsey has announced that he’s working on Web5, which will be focused on identity management and be based on Bitcoin.
  • Molly White’s post questioning the possibility of acceptably non-dystopian self-sovereign identity is a must-read; she has an excellent summary and critique of just about all the work going on in the field.
  • Cryptographer Matthew Green makes an important argument for the technologies behind cryptocurrency (though not for the current implementations).

Biology

Quantum Computing

  • Probabilistic computers, built from probabilistic bits (p-bits), may provide a significant step forward for probabilistic decision making. This sounds esoteric, but it’s essentially what we’re asking AI systems to do. P-bits may also be able to simulate qubits and quantum computing.
  • A system that links two time crystals could be the basis for a new form of quantum computing. Time crystals can exist at room temperature, and remain coherent for much longer than existing qubit technologies.
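
For readers who want a feel for the p-bit idea mentioned above, here is a minimal software sketch. It is not taken from the research itself: it assumes the commonly described model in which a p-bit fluctuates between 0 and 1, and an input bias tilts the odds through a sigmoid. The names p_bit and estimate_probability are hypothetical, chosen only for this illustration.

```python
import math
import random


def p_bit(bias: float, rng: random.Random) -> int:
    """Sample one value from a simulated p-bit.

    Assumption for this sketch: the bit reads 1 with probability
    sigmoid(bias), so bias = 0 behaves like a fair coin.
    """
    p_one = 1.0 / (1.0 + math.exp(-bias))
    return 1 if rng.random() < p_one else 0


def estimate_probability(bias: float, samples: int = 10_000, seed: int = 0) -> float:
    """Estimate how often the simulated p-bit reads 1 for a given bias."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    return sum(p_bit(bias, rng) for _ in range(samples)) / samples


if __name__ == "__main__":
    for bias in (-5.0, 0.0, 5.0):
        print(f"bias={bias:+.1f}  fraction of 1s ~ {estimate_probability(bias):.3f}")
```

A strongly negative bias pins the bit near 0, a strongly positive bias pins it near 1, and values in between give a tunable coin flip. That tunable randomness is what makes p-bits a natural fit for probabilistic decision making.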