Silicon Photonics for AI (6:00pm PT)

July 2024 · 5 minute read

08:28PM EDT - Lightmatter develops silicon photonics based AI systems

08:29PM EDT - A new type of computer

08:29PM EDT - Lightmatter Mars

08:29PM EDT - multi-chip solution

08:31PM EDT - Workloads grow out to datacenter scales

08:31PM EDT - a new hw approach is needed

08:31PM EDT - Using standard CMOS

08:31PM EDT - Optical transport

08:32PM EDT - Perform computation in the optical domain, even in parallel

08:32PM EDT - 1000x faster than electronics at 10x speed with 1000x less power for the same die area

08:32PM EDT - Single MAC at microwatt in photonics vs a milliwatt in electronics

08:32PM EDT - Takes about the same area

08:33PM EDT - 10s of watts for data transport - free with optics

08:33PM EDT - Free from RC time constants

08:33PM EDT - 10s of watts to single digit microwatts

08:34PM EDT - MZI phase shift interference detection

08:34PM EDT - Mach Zehnder Interferometer

08:34PM EDT - Interference creates a multiplier
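For readers wondering why interference acts as a multiplier, here is a minimal sketch of an idealized, lossless MZI: two 50:50 couplers around a single phase shifter. The function names and the textbook coupler matrix are illustrative assumptions, not Lightmatter's device model. In this model the power leaving each port is the input power scaled by sin^2(phi/2) or cos^2(phi/2), so the phase setting plays the role of a weight.

```python
import numpy as np

def coupler():
    """Ideal lossless 50:50 directional coupler (textbook model, assumed)."""
    return np.array([[1, 1j], [1j, 1]]) / np.sqrt(2)

def phase_shifter(phi):
    """Phase shift of phi radians applied to the top arm only."""
    return np.array([[np.exp(1j * phi), 0], [0, 1]])

def mzi(phi):
    """Transfer matrix of the MZI: coupler, phase shifter, coupler."""
    return coupler() @ phase_shifter(phi) @ coupler()

def mzi_output_power(input_power, phi):
    """Optical power at the two output ports when light enters the top port."""
    field_in = np.array([np.sqrt(input_power), 0])
    field_out = mzi(phi) @ field_in
    return np.abs(field_out) ** 2

if __name__ == "__main__":
    # Sweep the phase and watch the power move between the two ports.
    for phi in (0.0, np.pi / 2, np.pi):
        top, bottom = mzi_output_power(1.0, phi)
        print(f"phi={phi:.2f}  top={top:.3f}  bottom={bottom:.3f}")
```

The two port powers always sum to the input power in this lossless model, which fits the "no fundamental energy required" point: the multiply is a redistribution of light, not a dissipative operation.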

08:34PM EDT - No fundamental energy required

08:34PM EDT - near-zero

08:35PM EDT - independent of process, voltage

08:35PM EDT - Many ways to build phase shifters

08:35PM EDT - Mars uses Nano Optical Electro Mechanical System NOEMS

08:35PM EDT - Run at 100s of MHz vs 10s of kHz

08:36PM EDT - Mars uses a mechanical solution

08:36PM EDT - affect the refractive index

08:36PM EDT - suspend in air during manufacture and etch under it

08:36PM EDT - Cdyn is super low

08:37PM EDT - Optical Vector MAC

08:37PM EDT - Directional couplers

08:38PM EDT - 2x2 matrix multiplied by a 1x2 vector

08:38PM EDT - At speed of light, almost zero power

08:38PM EDT - Array of MZIs

08:38PM EDT - Build large matrix vector structures

08:38PM EDT - 1000x1000 or larger
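A minimal sketch of how an arbitrary weight matrix might be mapped onto interferometer arrays. The talk did not say how Mars programs its array; the decomposition below (an SVD into two unitary stages with a per-channel scaling between them) is an approach commonly described for photonic matrix multipliers and is purely an assumption here. The function names are hypothetical.

```python
import numpy as np

def decompose_for_mesh(weights):
    """Split a square real matrix into unitary, scaling, unitary stages.

    In a photonic implementation the two unitary stages would each be an
    MZI mesh and the scaling stage a row of attenuators or amplifiers
    (assumed mapping, not confirmed by the talk).
    """
    weights = np.asarray(weights, dtype=float)
    if weights.ndim != 2 or weights.shape[0] != weights.shape[1]:
        raise ValueError("this sketch only handles square matrices")
    u, s, vh = np.linalg.svd(weights)
    return u, s, vh

def optical_matvec(u, s, vh, x):
    """Apply the three stages in the order light would traverse them."""
    after_first_mesh = vh @ x           # first unitary stage
    after_scaling = s * after_first_mesh  # per-channel scaling
    return u @ after_scaling            # second unitary stage

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(4, 4))
    x = rng.normal(size=4)
    u, s, vh = decompose_for_mesh(w)
    print(np.allclose(optical_matvec(u, s, vh, x), w @ x))
```

The point of the sketch is only that passing a vector through the three stages reproduces an ordinary matrix-vector product for any square matrix.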

08:39PM EDT - 1000s MACs per 100ps

08:39PM EDT - limitation is surrounding electronics

08:39PM EDT - High speed data photonics at the edge

08:39PM EDT - Performance scales with area

08:39PM EDT - power scales with sqrt(area)

08:40PM EDT - 64 DAC and 64 ADC = 4096 MACs
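The arithmetic behind that line, as a small sketch. It assumes an N x N array fed by N DACs and read out by N ADCs, which is one reading of the slide and consistent with 64 x 64 = 4096 and with the earlier area versus sqrt(area) scaling claims. The function names are hypothetical, and `vector_rate_hz` is a free parameter here, not a figure taken from this slide.

```python
def converters_needed(n):
    """N DACs drive the input vector, N ADCs read the output vector."""
    return {"dacs": n, "adcs": n}

def macs_per_pass(n):
    """An N x N matrix times an N-vector is N * N multiply-accumulates."""
    return n * n

def macs_per_second(n, vector_rate_hz):
    """Throughput if one input vector is pushed through per cycle."""
    return macs_per_pass(n) * vector_rate_hz

if __name__ == "__main__":
    n = 64
    print(converters_needed(n))  # converter count grows linearly with N
    print(macs_per_pass(n))      # MAC count grows with N squared
```

Doubling N quadruples the MACs per pass while only doubling the converter count, which is why the converters rather than the optics end up as the budget item to watch.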

08:40PM EDT - Limit is pushing the weights into the array

08:40PM EDT - 3 orders of magnitude faster than electronics

08:41PM EDT - Each element can take multiple data points - parallel processing

08:42PM EDT - optics in different colors etc

08:42PM EDT - like fiber optics

08:42PM EDT - 1 GHz vector rate - set by data conversions

08:42PM EDT - 50mW laser

08:42PM EDT - 90nm GloFo standard photonics process

08:42PM EDT - 150mm2

08:42PM EDT - yield very well

08:43PM EDT - Mars SoC 14nm custom ASIC

08:43PM EDT - mm2

08:43PM EDT - Analog interfaces to Photonics

08:43PM EDT - SRAM for weights and activations

08:44PM EDT - single fully synchronous pipeline scheduler

08:47PM EDT - 3W TDP...

08:48PM EDT - Most power is data movement

08:48PM EDT - 3D integration

08:49PM EDT - optical core and ASIC are stacked

08:49PM EDT - Laser power coming in from external to chip

08:49PM EDT - Support for ML Frameworks - PyTorch, TensorFlow, ONNX

08:51PM EDT - Q&A Time

08:51PM EDT - Q: 3W TDP? A: That's for SiPh, Laser, SoC, everything

08:52PM EDT - Q: Perf on ResNet? A: Not publishing yet, but we have simulator results and demo chips in the lab

08:52PM EDT - Q: MLPerf? A: We're working on it!

08:53PM EDT - Q: Models bigger than on-system memory? A: We know it's an important problem to solve! We think we can solve it through photonics. Looking at scale out solutions for training and inference

08:54PM EDT - HBM roadmap is good, but it doesn't scale with the BW that we need, so we need solutions that scale at factors of 10, that's what we're looking at

08:56PM EDT - Q: How robust are the interferometers? A: MEMS have a yield - you can enhance based on scales on feature sizes. Reliability - these things have to live in a datacenter for 10 years, so we're looking at robust devices. High reliability MEMS device technology has been around a while. These devices are really small, so these are tiny - we're not pushing to a limit. These are tiny movements to affect the effective refractive index

08:56PM EDT - Q: DAC precision? A: Not as critical as you think. Generally 8-bit DAC. We can scale to 12-bit and still build high perf system. We're building something that matches the rest of the industry. DAC/ADC are the rate limits of the design
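To put the 8-bit versus 12-bit answer in rough numbers, here is a sketch of an ideal uniform quantizer. It assumes values normalized to [-1, 1] and ignores noise and nonlinearity, both of which a real DAC has; the function names are hypothetical and nothing here describes Lightmatter's actual converters.

```python
import numpy as np

def quantize(values, bits):
    """Snap values in [-1, 1] to the nearest of 2**bits evenly spaced levels."""
    step = 2.0 / (2 ** bits - 1)
    clipped = np.clip(values, -1.0, 1.0)
    return np.round((clipped + 1.0) / step) * step - 1.0

def max_quantization_error(bits):
    """Worst-case rounding error of the ideal quantizer: half a step."""
    return (2.0 / (2 ** bits - 1)) / 2

if __name__ == "__main__":
    x = np.linspace(-1.0, 1.0, 1001)
    for bits in (8, 12):
        err = np.max(np.abs(quantize(x, bits) - x))
        print(f"{bits}-bit: worst error on this grid = {err:.6f}")
```

Each extra bit roughly halves the worst-case error, so going from 8 to 12 bits tightens it by about a factor of 16 in this idealized model.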

08:57PM EDT - Q: 200ps, does that include digital? A: No, just the photonics and analog

08:58PM EDT - Q: Other neural networks? A: We're looking at them, but our goal is a general purpose accelerator. We have space on the 14nm ASIC of course

08:58PM EDT - Q: Limitations on weight matrices? A: No, we can represent any matrix

08:59PM EDT - That's the end of Hot Chips! I hope you've enjoyed the Live Blogs. There's so much to piece through after the fact. The Slack channel for attendees was going crazy. There's a small wrap up talk for a few minutes

09:00PM EDT - 2294 total participants this year!

09:00PM EDT - 2302! this slide was made too early

09:00PM EDT - Last year was a record 1250

09:01PM EDT - ~3.5% press in previous years

09:02PM EDT - Public slide decks and videos will be available later this year

09:07PM EDT - Thanks to everyone for tuning in. If you loved our content, sign up next year to watch it live :)
