
When will AI transform surveillance?

AI, or artificial intelligence, is changing the security industry with the power and scope of always-on networked computers.

We usually think of surveillance cameras as digital eyes, watching over us or watching out for us, depending on your view. But really, they’re more like portholes: useful only when someone is looking through them. Sometimes that means a human watching live footage, usually from multiple video feeds. Most surveillance cameras are passive, however. They’re there as a deterrent, or to provide evidence if something goes wrong. Your car got stolen? Check the CCTV.

But this is changing, and fast. Artificial intelligence is giving surveillance cameras digital brains to match their eyes, letting them analyze live video with no humans necessary. This could be good news for public safety, helping police and first responders more easily spot crimes and accidents, and it has a range of scientific and industrial applications. But it also raises serious questions about the future of privacy and poses novel risks to social justice.

A recent article by James Vincent, published in The Verge, makes interesting reading and asks probing questions about trust in governments. What happens when governments can track huge numbers of people using CCTV? When police can digitally tail you around a city just by uploading your mugshot into a database? Or when a biased algorithm is running on the cameras in your local mall, pinging the cops because it doesn’t like the look of a particular group of teens?

These scenarios are still a way off, but we’re already seeing the first fruits of combining artificial intelligence with surveillance. IC Realtime is one example. Its flagship product, unveiled last December, was billed as Google for CCTV. It’s an app and web platform named Ella that uses AI to analyze what’s happening in video feeds and make it instantly searchable. Ella can recognize hundreds of thousands of natural language queries, letting users search footage to find clips showing specific animals, people wearing clothes of a certain color, or even individual car makes and models.


In a web demo, IC Realtime CEO Matt Sailor showed The Verge a version of Ella hooked up to around 40 cameras surveilling an industrial park. He typed in various searches — “a man wearing red,” “UPS vans,” “police cars” — all of which brought up relevant footage in a few seconds. He then narrowed the results by time period and location and pointed out how users can give thumbs-up or thumbs-down to clips to improve the results — just like Netflix.

“Let’s say there’s a robbery and you don’t really know what happened,” says Sailor. “But there was a Jeep Wrangler speeding east afterward. So we go in, we search for ‘Jeep Wrangler,’ and there it is.” On-screen, clips begin to populate the feed, showing different Jeep Wranglers gliding past. This will be the first big advantage of combining AI and CCTV, explains Sailor: making it easy to find what you’re looking for. “Without this technology, you’d know nothing more than your camera, and you’d have to sift through hours and hours and hours of video,” he says.


Ella runs on Google Cloud and can search footage from pretty much any CCTV system. “[It] works well on a one-camera system — just [like] a nanny cam or dog cam — all the way up to enterprise, with a matrix of thousands of cameras,” says Sailor. Users will pay a monthly fee for access, starting at around $7, and scaling up with the number of cameras.

IC Realtime wants to target businesses of all sizes but thinks its tech will also appeal to individual consumers. These customers are already well-served by a booming market for “smart” home security cams made by companies like Amazon, Logitech, Netgear, and the Google-owned Nest. But Sailor says this tech is much more rudimentary than IC Realtime’s. These cameras connect to home Wi-Fi and offer live streams via an app, and they automatically record footage when they see something move. But, says Sailor, they can’t tell the difference between a break-in and a bird, leading to a lot of false positives. “They’re very basic technology that’s been around for years,” he says. “No AI, no deep learning.”

That won’t be the case for long. While IC Realtime offers cloud-based analytics that can upgrade existing, dumb cameras, other companies are building artificial intelligence directly into their hardware. Boulder AI is one such startup, selling “vision as a service” using its own standalone AI cameras. The big advantage of integrating AI into the device is that they don’t require an internet connection to work. Boulder sells to a wide range of industries, tailoring the machine vision systems it builds to individual clients.

“The applications are really all over the board,” founder Darren Odom tells The Verge. “Our platform’s sold to companies in banking, energy. We’ve even got an application where we’re looking at pizzas, determining if they’re the right size and shape.”

Odom gives the example of a customer in Idaho who had built a dam. In order to meet environmental regulations, they were monitoring the number of fish making it over the top of the structure. “They used to have a person sitting with a window into this fish ladder, ticking off how many trout went by,” says Odom. (A fish ladder is exactly what it sounds like: a stepped waterway that fish use to travel uphill.) “Then they moved to video and someone [remotely] watching it.” Finally, they contacted Boulder, which built them a custom AI CCTV system to identify types of fish going up the fish ladder. “We really nailed fish species identification using computer vision,” Odom says proudly. “We are now 100 percent at identifying trout in Idaho.”

If IC Realtime represents the generic end of the market, Boulder shows what a boutique contractor can do. In both cases, though, what these firms are currently offering is just the tip of the iceberg. In the same way that machine learning has made swift gains in its ability to identify objects, the skill of analyzing scenes, activities, and movements is expected to rapidly improve. Everything’s in place, including the basic research, the computing power, and the training datasets — a key component in creating competent AI. Two of the biggest datasets for video analysis are made by YouTube and Facebook, companies that have said they want AI to help moderate content on their platforms (though both admit it’s not ready yet). YouTube’s dataset, for example, contains more than 450,000 hours of labeled video that it hopes will spur “innovation and advancement in video understanding.” The breadth of organizations involved in building such datasets gives some idea of the field’s importance. Google, MIT, IBM, and DeepMind are all involved in their own similar projects.

IC Realtime is already working on advanced tools like facial recognition. After that, it wants to be able to analyze what’s happening on-screen. Sailor says he’s already spoken to potential clients in education who want surveillance that can recognize when students are getting into trouble in schools. “They’re interested in preemptive notifications for a fight, for example,” he says. All the system would need to do would be to look out for pupils clumping together and then alert a human, who could check the video feed to see what’s happening or head over in person to investigate.

This is fascinating stuff and an area we will be watching closely.

Read more at The Verge.
