
Sponsored by: NVIDIA
Industrial AI - Part 4: ML Lessons from Lion Vision. AI Failures, and ‘Sensing Like A Robot’.
Introduction: Searching for Vapes, Batteries and Other ‘Hazards’ in eWaste, using AI.
I’ve been writing about the Right to Repair movement, or as I dubbed it, ‘The Fight to Repair’ - alluding to how tricky it was to change my headphone batteries back in 2023. I proposed to RS DesignSpark that I ‘go down the rabbit-hole’ of asking why it was this hard: was it about design, legislation, consumerism, economics, or fashion? I found it was a bit of all of these, and more. However, it was investigating the ‘End of Life’ of consumer electronics that really brought all of these ideas into focus: if you don’t design something for disassembly - be it for repair or recycling - it’s going to create serious problems, first for the planet, and, with increasing legislation, for the businesses that ignore it too.
When I first visited the UK’s largest electrical waste recycling plant, SWEEEP Kuusakoski, in Kent, in November 2023, I had the pleasure of meeting Justin Greenaway (pictured in the green hard-hat) - and was astonished to hear how dangerous Li-Ion / Li-Po batteries are, found in anything from laptops to ‘hoverboards’ (pictured below). Perhaps most notorious are Vapes, with 8.2 million ‘Big Puff’ vapes thrown away each week in the UK - many of which are of course not disposed of safely, and end up in places where they can cause fires, or simply leach chemicals into our natural environment.
I had noticed an intriguing orange ‘prism’-like structure, shining brightly on the conveyor belt, as tons of fragments of obliterated electrical goods rushed by. I jokingly asked if they were about to be ‘beamed up’ to a new afterlife. Justin laughed - although this apparatus was ‘top secret’, I was not actually far off the mark. I was told to ask again about this mysterious Lion Vision gear in perhaps a year’s time… Respecting the non-disclosure request, I wrote up my blog for RS DesignSpark, focussing on a terrific interview with Justin discussing some of the amazing work SWEEEP do, but promised to get back in touch about what this ‘box of tricks’ was doing!
One year later I got back in touch with SWEEEP, and asked to be introduced to Lion Vision. They kindly explained this was a Machine Learning project, tasked with trying to ‘detect, visualise and extract’ problem items, such as Vapes and Batteries, from eWaste streams in places like SWEEEP. Above, the ‘live feed’ cameras are shown, streaming over a secure internet link to Lion Vision’s offices in Manchester, on an industrial estate next to an exhaust pipe manufacturer.
I have also written about the ethics and history of AI and ML, in case you’re interested - see the previous blogs (Part 1 - Big Picture, Part 2 - Ethics, Part 3 - Key Terms).
Silly Tests, Serious Applications.
Above: Task - to select only the Purple Sweets. Next, all sweets are ‘tagged’ after training by inputting photos of each sweet type/flavour (the machine is ‘learning’ what each one is). The last image shows the Purple ones separated from the rest. This is akin to selecting Vapes from eWaste, etc.
I proposed the ‘Quality Street test’ to Lion Vision because, although their NVIDIA-powered Machine Learning is most certainly cutting edge - scouring complex, messy waste streams for batteries and vapes - I wanted a memorable way to illustrate that the starting point for using Machine Learning in an industrial process can be much simpler.
If you have not yet embarked on any Machine Learning transformations in your company, I would stress the following:
a). Your problem may well be easier than you think! You might be sorting things which, although numerous in quantity, are not highly complex - for example, they are not ‘hard to spot’ (like a Purple sweet among the rest of the sweets). The Quality Street sweets example could in fact be done using a much more basic, circa £200 NVIDIA Jetson Nano (252-0055) - more on that later.
Indeed, not everything is as hard as Vapes/Batteries in eWaste. You might want to spot skewed labels on bottled goods, damaged/bruised produce, missing parts, etc. These might be easier to ‘train’ a Machine Learning model for than you might suppose. Even the ‘straightness’ of a Cucumber can be defined as good or bad, as this example - done on a ‘shoe-string’ budget in Japan - illustrates.
Above: A nice example is from Makoto Koike, who returned after graduation to his parents’ cucumber farm and built a ‘cucumber sorter’ using a basic Raspberry Pi computer running TensorFlow (an open-source machine learning framework). The example is from 2016, and of course computers and techniques have moved on a lot since then, but many of the principles remain the same (a minimal sketch of this kind of classifier follows below, after point b). Image Credits: Raspberry Pi.
b). The ‘training’ to spot a given thing, whether to ‘pick’ (a Purple sweet) or ‘reject’ (Vapes/Batteries), follows a broadly similar approach. The key is not just the ‘computation’ but also the ‘setup’.
To take the ‘needle in a haystack’ example: if you could *only* ‘look’ for a needle in a haystack, this is metaphorically and literally very hard - a haystack is about 12 feet high and about 8 feet in diameter. Instead, let’s say you elect to feed all the hay along a conveyor belt, and rather than a camera you use an X-Ray (or frankly, a big magnet!) - chances are you’re far more likely to find said needle. It’s this sort of ‘rationalising’ of the Software with the Hardware which is critical for a good Machine Learning setup.
SWEEEP and Lion Vision abide by the same rationalisation principles - and make sure the streams of eWaste are ‘pre-levelled-out’, such that the camera has a better chance of seeing, say, 5-15mm high batteries mixed in among no more than 20-25mm high chunks of eWaste. If they were buried in the ‘haystack-sized’ piles tipped out from giant lorries, the task would be futile. Hence this is not just a ‘Machine Learning challenge’, but a ‘Design/Engineering challenge’ - to make the ‘setup’ as favourable as possible for AI/ML to work.
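To make point a) tangible, here is a minimal, hypothetical sketch of the kind of ‘simple’ classifier that the cucumber sorter and the Quality Street test both boil down to. It assumes a recent version of TensorFlow/Keras and a folder of labelled photos (‘sweets/purple’ and ‘sweets/other’ are made-up folder names) - a starting point, not anything like Lion Vision’s production system.

```python
# Minimal image classifier sketch: folders of labelled photos in, a tiny CNN trained on them out.
# Assumes TensorFlow 2.x and a hypothetical 'sweets/' directory with 'purple/' and 'other/' subfolders.
import tensorflow as tf

train = tf.keras.utils.image_dataset_from_directory(
    "sweets", image_size=(96, 96), batch_size=32)        # subfolder names become the class labels

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 255),                # scale pixel values to 0-1
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(2)                             # two classes: purple vs everything else
])
model.compile(optimizer="adam",
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=["accuracy"])
model.fit(train, epochs=5)
```

A small network like this will happily run on Jetson Nano or Raspberry Pi-class hardware, which is usually plenty for the ‘easy to spot’ end of the spectrum.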
The following is the longer edit of the video at Lion Vision, showing the results, and a special idea!
Above: George Hawkins (Lion Vision) and Jude Pullen (reporting for RS Group), developing the Machine Learning system, running on NVIDIA, to detect vapes, batteries and sweets!
Lessons from Cucumbers, Vapes, Sweets - and LEGO.
1. Lighting (Input Quality).
By now you’ve probably noticed one thing common to all of these examples - the importance of appropriate lighting. If you’ve ever had good or bad portrait photos taken, lighting is usually most of the reason for the un/desired outcome - and Machine Learning with Cameras is no different!
When I worked at LEGO* as a Tech Scout and Partnerships Director, Machine Learning was starting to grow in interest, with Apps like BrickIt emerging which ‘scanned’ a pile of bricks and suggested things to build… Creatives were excited by what this might mean, and all manner of experiments were taking place (sadly none I can talk about here!). Anyway, this excellent video by Daniel West was unsurprisingly ‘doing the rounds’ among techy folks I knew, both inside and outside of LEGO’s Creative Play Lab. (Also a great follow-up how-it-works video too).
Above: The World’s First [AI] LEGO Sorting Machine by Daniel West. Credits: Daniel West. 2016.
Daniel West’s process is a ‘tour de force’ in appreciating the tension between ‘software and setup’ that I’ve mentioned: he spent considerable time ensuring that each part was evenly illuminated - the above picture (right) shows the ‘light box’ being created - and he used a white paper conveyor belt to avoid background confusion.
*Side Note: LEGO Bricks vs Ambient Light.
As a very nerdy ‘easter egg’ in the Lion Vision video, I wore my LEGO “George Says Awesome” T-Shirt, which is a reference to LEGO’s first AI Game: Life of George, from 2011, which uses AI-Vision to verify brick arrangements. I thought George (from Lion Vision) might ‘get the joke’, but firstly, I didn’t unzip my jumper enough for him to see it, and secondly, he’s frankly too young to remember it! Which reminded me I should stop being such a nerd and get on with the work!
Anyway, there was a serious side to this related to Machine Learning vision - I know anecdotally from this project that cameras/AI struggle to detect the differences between Yellow and Red bricks in ambient light. For example, a shadow on a yellow brick can look close to red, and a very brightly lit red brick can look yellow-ish. To the human eye, this sounds implausible, but if you know the infamous ‘Blue/Gold Dress’ online phenomenon (where people couldn’t agree whether the stripes on a dress were blue and black or white and gold), then this is the same issue AI faces.
So Daniel’s work is really impressive, but the real world is much harder: it is not inside a ‘light box’, and random, changeable ambient light can really confuse an AI system. Hence it matters to try and correct for these things either by design, or by using extra training to correct the perception.
Above: Life of George (an AI LEGO brick-scanning party game). An excerpt from the Lion Vision interview, with me discussing the colour issue for AI-vision. The infamous ‘Blue/Gold Dress’ confusion around colours.
Anyway, in summary, lighting needs to be carefully controlled (or a lot of counter-training given to compensate for random ambient light). However, there are some useful short-cuts, and one that Lion Vision uses to help build up the model’s accuracy is to artificially generate images from real images…
2. Autogeneration (Many-from-One Image Generation).
One of the peculiar things you realise about Machine Learning is that in some respects it is very ‘smart’ - it can do things humans can’t do, because we can’t concentrate for hours, or we get distracted, or we miss details. However, it’s also ‘dumb’, in that at the first pass of a new image of, say, a Brick, Cucumber or Purple Sweet, the AI does not initially recognise an upside-down image as the same thing the ‘right way up’!
Above: Before and After ML Training: Left, an ML model does not know that a purple sweet is still the same sweet, even when rotated upside down. Right, an ML model needs dozens, perhaps hundreds, of images with tiny rotations, so it can recognise a sweet from any angle. Thankfully, these can be autogenerated!
A two year old child will recognise that the top left sweet is the same as the bottom left sweet, only rotated 180 degrees, upside down. However, for Machine Learning, it starts to become apparent why this name is apt - *machines* have a lot to *learn*. In practice, this means you need to generate hundreds of versions of a single image, all slightly rotated, so that the machine ‘learns’ that they are all the same. Granted, ML systems will start to get better at this, but the fact is, even if this is automated, as it was with Lion Vision (George having written a clever script to do this in a Photoshop-like way), it still requires us humans to set up the task of training a ‘dumb’ or ‘baby’ machine to understand the world. Whether to see this as an opportunity to ask some humbling and profound philosophical questions is entirely your call - but having worked on such projects (and being a father to a young child), I find it hard not to appreciate the marvel of nature, whilst also realising this is gonna need a few late nights to get right!
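As a rough illustration of that ‘many-from-one’ idea - my own sketch, not George’s actual script - the following assumes Python with the Pillow library and a hypothetical source photo, purple_sweet.jpg, and spins out 72 slightly-rotated copies:

```python
# 'Many-from-one' augmentation sketch: take one labelled photo and generate dozens of
# rotated copies, so the model can learn that orientation does not change what the object is.
# Assumes Pillow is installed and a hypothetical source image 'purple_sweet.jpg' exists.
from pathlib import Path
from PIL import Image

src = Image.open("purple_sweet.jpg").convert("RGB")
out_dir = Path("augmented")
out_dir.mkdir(exist_ok=True)

for angle in range(0, 360, 5):                       # 72 copies, 5 degrees apart
    rotated = src.rotate(angle, expand=True,         # expand so corners are not cropped off
                         fillcolor=(255, 255, 255))  # pad with white, like a light-box background
    rotated.save(out_dir / f"purple_sweet_{angle:03d}.jpg")
```

The same trick (plus flips, crops, brightness shifts and so on) is built into most ML frameworks under the formal name of ‘data augmentation’.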
To try and empathise a little with this apparent ‘dumbness’, one has to appreciate how the human eye works, or indeed, how it’s evolved, and how much insane ‘processing’ even a 2-year-old child's brain is doing!
Above: A great example from the Visual Capitalist blog shows how both ‘compound’ and ‘simple’ eyes have evolved. It may even surprise you to know the backs of our knees are the next most light-receptive place on our bodies, after our eyes - making the point that not all ‘sensors’ need to be of the same fidelity or complexity to still be ‘useful’ to us. Indeed, sometimes low fidelity can be an advantage, for example if privacy is needed and only relative motion is to be detected.
Getting back to sweets: so it is with pointing a camera at a purple Quality Street sweet. The image is split into segments (let’s use a 5x5 grid, below), which are individually scanned, and various ‘filters’ (the red lines represent filters / computations) are applied to each. Each segment has quite different characteristics.
One can hopefully appreciate, comparing the first and fourth image (rotated 180 degrees) here, that the computer will not see the same parts of the sweet in a given segment - so from its perspective this is a ‘new thing’, and it has to be informed that they are all in fact ‘one group’, to be classified as ‘one and the same’.
Above: inputting sweets in slightly different rotations, and with slightly different wrapper shapes into the Lion Vision programme. Safe to assume this is far simpler than with Vapes/Batteries in eWaste!
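To see why the grid-of-segments view makes a rotated sweet look like a ‘new thing’, here is a toy sketch (mine, not Lion Vision’s code) that splits an image into a 5x5 grid, summarises each cell with a crude stand-in ‘filter’ (mean brightness), and compares the original against a 180-degree rotated copy:

```python
# Why rotation confuses a grid-by-grid view: split an image into a 5x5 grid, summarise each
# cell, and note that the rotated copy's cells no longer line up with the original's.
import numpy as np

def grid_features(img, cells=5):
    """Mean brightness of each cell in a cells x cells grid over a square greyscale image."""
    h, w = img.shape[0] // cells, img.shape[1] // cells
    return np.array([[img[r*h:(r+1)*h, c*w:(c+1)*w].mean() for c in range(cells)]
                     for r in range(cells)])

rng = np.random.default_rng(1)
sweet = rng.random((100, 100))                 # stand-in for a 100x100 photo of a sweet
original = grid_features(sweet)
rotated = grid_features(np.rot90(sweet, 2))    # the very same sweet, turned 180 degrees

# Cell (0, 0) of the rotated image now contains what cell (4, 4) held before, so a
# cell-by-cell comparison sees two 'different' objects unless we teach it otherwise.
print(np.allclose(original, rotated))               # False - looks like a new thing
print(np.allclose(original, np.rot90(rotated, 2)))  # True - the same sweet once re-aligned
```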
That a rotated sweet is still the same sweet is ‘obvious’ to humans, babies even, but this crudely illustrates why some patience is needed when training a Machine Learning algorithm for the first time - and, if we’re fair, a baby has had many billions more images/sights to ‘process’ - as well as sounds, touches, smells, tastes, etc. - all of which most computers are ‘dumb’ to.
In time, I anticipate computers will make better ‘assumptions’ and ‘inferences’ - in the way ChatGPT works on your ‘question’ whilst referencing a ‘world of data’ in parallel. It seems inevitable that these systems will become intertwined in the years to come, reducing the ambiguity of training data such as this.
Does that mean one should ‘bide one’s time’? Yes and no - you can wait for companies to do all the hard work for you, and then buy their product(s), but one needs to accept two key factors: a) you will always be at the mercy of their algorithms and objectives, which they may change or render inoperable for your given application; and b) you will most certainly be ‘mined’ for data that they can exploit.
3. Detection (Sensing Quality).
As alluded to earlier, a baby’s sensory input is not ‘sight’ alone, but spans many senses - touch, taste, hearing and smell too. These of course have their ‘machine equivalents’, as illustrated below, and it’s worth remembering that some problems may well need a different input, a shift in ‘spectrum’ beyond that of humans (e.g. Ultraviolet, Infrared, Ultrasonic, etc.), or indeed a combination of inputs may be critical to obtaining a good result.
Above: A range of sensors (optical, acoustic, force, chemical), all of which can in principle be fed into a Machine Learning environment, so long as the ‘desired’ and ‘undesired’ bounds are clearly defined.
As mentioned in Part 1, there are many such sensors available at RS Online, which ‘play nicely’ with the GPIOs of your chosen Machine Learning Single Board Computer (SBC), and these might interface using addressable I2C or SPI protocols.
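As a hedged sketch of what that looks like in practice - using the smbus2 Python library, with a made-up bus number, device address and register (always check your sensor’s datasheet) - reading a couple of bytes from an I2C sensor on a Pi or Jetson-class SBC can be as short as this:

```python
# Hedged I2C read sketch for a Linux SBC (Raspberry Pi, Jetson, etc.) using smbus2.
# The bus number, device address and register below are placeholders for illustration only.
from smbus2 import SMBus

I2C_BUS = 1           # /dev/i2c-1 is the usual user-facing bus on many SBCs
SENSOR_ADDR = 0x48    # hypothetical device address - check your sensor's datasheet
DATA_REGISTER = 0x00  # hypothetical register holding the measurement

with SMBus(I2C_BUS) as bus:
    raw = bus.read_i2c_block_data(SENSOR_ADDR, DATA_REGISTER, 2)  # read two bytes
    value = (raw[0] << 8) | raw[1]                                # combine into a 16-bit reading
    print(f"Raw sensor value: {value}")
```

Readings like this, logged alongside camera frames, are exactly the sort of extra ‘sense’ that can be fed into a model as an additional input.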
However, this is really an extension of my earlier point on ‘good lighting’ - which is to say, good lighting is about ‘good imagery’. So, if you need to pick up an ‘audio input’, the question of ‘how do I pick a good microphone?’ is going to be critical to your success. Choosing quality equipment/sensors - and ones specific to your given application - will influence the results.
Ask any podcaster or reporter ‘will one mic do for everything?’ and they’ll laugh at you!
Above: Performance of different Microphone configurations, alongside a multitude of physical shapes and sizes - to illustrate that even a ‘microphone’ has a wide range of specific use cases, with different quality results. Credits: Left - Planetary Group. Right - ProMovieMaker.com.
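If you do go down the audio route, the capture side is the easy part. Here is a hedged sketch (assuming the sounddevice library and whatever microphone happens to be attached) that records a short clip and pulls out a crude frequency ‘fingerprint’ - the kind of feature a model would then be trained on, which is exactly why the quality of the microphone feeding it matters:

```python
# Record a short clip from the default microphone and extract a crude frequency 'fingerprint'.
# Assumes the 'sounddevice' library and any attached microphone; values are illustrative.
import numpy as np
import sounddevice as sd

SAMPLE_RATE = 16_000                  # samples per second
DURATION = 2                          # seconds to record

clip = sd.rec(int(SAMPLE_RATE * DURATION), samplerate=SAMPLE_RATE, channels=1)
sd.wait()                             # block until the recording has finished

spectrum = np.abs(np.fft.rfft(clip[:, 0]))              # magnitude spectrum of the mono channel
freqs = np.fft.rfftfreq(len(clip), d=1 / SAMPLE_RATE)   # frequency of each spectrum bin
print("Loudest frequency:", round(freqs[np.argmax(spectrum)]), "Hz")
```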
The Raspberry Pi camera shown above is just a basic HD camera, which might be fine for your school project, but Lion Vision is using considerably higher-specification cameras, probably costing many hundreds of pounds each - they may well have anti-glare filters, anti-fog coatings, etc. So you need the right quality of equipment for the job: for some applications a ‘lo-fi’ input will do, and for others you’ll need state-of-the-art.
With that said, as engineers we pride ourselves on being ‘clever’, not just ‘smart’. It may be, for example, that you find it hard to distinguish between two plastics, e.g. PP and PVC - one could use all manner of clever optical methods, or one could use flotation tanks, knowing that PVC sinks in water and PP floats. This is of course what recycling facilities do, but it’s only obvious in hindsight!
4. Not Using a Sledgehammer to Crack a Nut.
When collaborating with George at Lion Vision, I was very aware they had ‘pro-grade’ NVIDIA processing capability, judging from the whirring of the ‘gamer-level’ computer under the desk. I have no idea of the exact cost, but it was certainly thousands of pounds. For our purposes it was most certainly a ‘sledgehammer to crack a nut’, but of course this was to demonstrate the principles, whilst showing that one can ‘upgrade’ to more complex equipment if the task demands it.
However, it must be stressed that if your challenge is not as tricky as Lion Vision’s - sorting batteries and vapes from eWaste - then a small ‘starter kit’ like the c.£200 NVIDIA Jetson Nano, plus some cameras/microphones/etc., plus a smart graduate like George, may be all you need to get started exploring things. You may laugh that the ‘smart graduate’ is evidently the bigger expense if this is a new hire, but having worked at companies from Dyson to Sugru to LEGO, it always seems to be the ‘youngbloods’ who enthusiastically bring such new tools and toys into a company. However, I have also seen that the ‘make-or-break’ moment is whether the rest of the company has the humility to give these people and their new ideas and technologies a chance.
At Dyson, I co-founded ‘Open Ideas Day’ with Darren Lewis - it was (and still is!) one day a month where engineers and designers can work on anything new, even if it’s not necessarily related to what the business does today. And that’s the point: explore what it might be doing tomorrow!
I was not the first to bring Arduino into Dyson - I cannot take credit for that; it was a number of keen young graduates and even a few scientists. However, after 6 months of emphatically preaching to engineering managers about how ‘awesome’ it was, I got nowhere. The tipping point was when, in exasperation, I said: ‘do you realise it’ll save huge amounts of time and money, by allowing designers to prototype electronics - perhaps meaning the electrical engineers spin up 2 instead of 10 PCBs for a given review milestone?’ I had made the classic error of youthful enthusiasm: I was so busy talking about how democratic, innovative and multi-disciplinary Arduino was, I forgot to say it was also gonna save time, mistakes and money. I forgot to speak the language of business.
You might be thinking that Dyson was just cynical about money, but in every company I have worked for, one needs to balance the ‘passion’ with the ‘pragmatism’. If you’re bankrupt, there are no cool ideas. So too with Machine Learning: I suggest the emphasis be on trying small, agile projects which prove their business case as soon as possible, and from that foundation more hires, tools and expertise can be built. It is not my place to speculate whether Lion Vision will become experts in checking for badly baked cakes, scanning crops with drones, or helping cure cancer - but I think they are confidently ‘walking before they run’, and that message is applicable to many other companies yet to embark on a journey in Machine Learning. Safe to say, like Ghostbusters, you’ll now know who to call, where to buy your parts, and that it’s not too scary to get started on what will likely be an interesting first step into a new frontier.
5. Technically Correct, but Totally Wrong.
Above: Typical airport baggage scanning machine (for example only, not related to my story/point). Image Credits: MDetection.
I’ve had various conversations with fellow engineers about Machine Learning ‘pitfalls’, and I think these are always worth sharing. However, I’ve had so many that I forget who told me this one - I’ll retrospectively credit them if they remind me! (It may have been a scientist at UCL). Anyway, it’s a beautiful example of the delta between human perception and machine perception.
An engineer was working to create an AI detector for airport security. Similar to what George did at Lion Vision, you show it X,000 images of ‘normal’ luggage and Y,000 images of ‘dangerous’ luggage - which perhaps has an explosive liquid in it, or whatever. The ML is then supposed to learn the identities and refractions - whatever ‘fingerprint’ distinguishes the ‘bad’ from the ‘safe’ stuff. This is arguably a hybrid of supervised learning - telling it what is and isn’t okay - combined with an unsupervised element, as you allow the ML to decide which attributes it believes are significant to your inputted categorisation.
The engineers had two *identical* X-ray scanners, so on Machine A they scanned all the ‘bad’ stuff, and on Machine B they scanned all the ‘good’ stuff. They tested the machines after building the data model, with one fatal error: they kept using Machine A to check the ‘bad’ and Machine B to check the ‘good’… you can perhaps already see where the issue lies!?
When they launched the product with the ML training, it kept missing the ‘bad’ stuff. It was not merely under 50% confident - it was 100% confident things were ‘safe’ when they put through stuff that was 100% ‘bad’ (i.e. fake bombs, toxic chemicals, etc.). It was spectacularly wrong.
What had in fact happened was that the machines were identical ‘on paper’, but Machine A had a tiny defect. Perhaps a speck of dust was on the camera, so a certain pixel was always present for the ‘bad’ things. The model learned that ‘bad’ things have a pixel, say in the top left corner, while Machine B (which scanned only the ‘safe’ stuff) had no such dust/pixel. Consequently, the model had ‘honed in’ on entirely the wrong trait of the problem. It did not have the ‘common sense’ to ignore the speck of dust, and so assumed with 100% certainty that this was what made a thing ‘bad’. It ignored all other variables.
It was 100%, precisely, unequivocally, totally right - about the entirely wrong thing!
This is exactly why you need to train the model carefully and, of course, test across a variety of machines. When I told this story to George, he said he had to be very careful to ‘exclude’ the difference between his lab conveyor belt - which was pristine and green - and the SWEEEP (real world) one, which was black, gnarly and dirty from heavy use. He had to train the ML not just on *what is*, but also on *what is not*, a relevant feature pertaining to batteries/vapes - or in our case, sweets.
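To make that failure mode concrete, here is a toy sketch - synthetic data and scikit-learn, nothing to do with the real airport system or Lion Vision’s code - in which the ‘bad’ class is only ever scanned on a machine with a single ‘dust spec’ pixel. The model looks perfect on its biased training setup, then waves nearly everything through once the dust is gone:

```python
# Toy demonstration of a spurious feature: the 'bad' class is only ever captured on a machine
# with one saturated 'dust spec' pixel, so the model learns the dust, not the contents.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_scans(n, bad, dusty_machine):
    """Random 16x16 'scans', flattened to feature vectors."""
    imgs = rng.random((n, 16, 16)) * 0.7      # background clutter
    if bad:
        imgs[:, 8, 8] += 0.1                  # the genuine (weak) signature of a 'bad' item
    if dusty_machine:
        imgs[:, 0, 0] = 1.0                   # the dust spec on Machine A
    return imgs.reshape(n, -1)

# Training data: 'bad' only ever scanned on dusty Machine A, 'good' only on clean Machine B.
X = np.vstack([make_scans(500, bad=True, dusty_machine=True),
               make_scans(500, bad=False, dusty_machine=False)])
y = np.array([1] * 500 + [0] * 500)
model = LogisticRegression(max_iter=1000).fit(X, y)

# Deployment: 'bad' items now arrive via a clean machine - no dust spec.
bad_but_clean = make_scans(200, bad=True, dusty_machine=False)
print("Fraction flagged as 'bad':", model.predict(bad_but_clean).mean())  # typically close to 0.0
```

Swap the ‘dust spec’ for a pristine green lab conveyor belt versus a gnarly black industrial one, and you have exactly the trap George had to design and train his way around.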
I think this is the best example/reminder not to ‘think like a human’, but to realise that an ML will take any ‘short cut’ it can - it’s entirely how its coding works - zooming in on a possible pattern without the ‘big picture’ of life’s complexity to cross-reference - yet! So until then, you need to explain things to it as if it were a baby… and, like being a parent, brace yourself for some repetition, mistakes and a lot of patience! But the reward will come, eventually!
Industrial AI Blog Series Contents:
Part 1: Lion Vision, AI vs Automation, and Why a Game of ‘Go’ Changed Everything.
Part 2: “Dirty, Dangerous, Difficult & Dull” - The Case for Ethical AI Automation.
Part 3: Key ML Terminology: Are You 'Sorting Ingredients' or 'Baking Cakes'?
Part 4: ML Lessons from Lion Vision. AI Failures, and ‘Sensing Like A Robot’.
Part 5: Getting Started with Jetson Nano / Orin. And Why Octopus Brains are ML Marvels.
Part 6: A *Shiny* Idea, Whilst at Lion Vision: “Hi Vis Batteries”. And Why You Need Underdog Engineers.
Are you looking for additional information on Lion Vision?