Samsung and Xilinx Partner for Industry’s First Customizable, Programmable Computational SSD

Peter_Brosdahl

[Image: Samsung SmartSSD CSD (credit: Samsung)]



SSDs are continually advancing. While the norm has been to rely on die shrinks, added layers, new controllers, and faster interface specifications, Samsung and Xilinx have partnered on an innovation called the SmartSSD Computational Storage Device (CSD), which shifts compute tasks such as compression from the CPU onto the storage device itself. Here's a brief description.



The SmartSSD CSD performs high-speed computations on data where it is stored. Combining a high-performance Samsung Enterprise SSD and an acceleration-dedicated Xilinx Kintex UltraScale+ FPGA with a fast private data path between them, the SmartSSD CSD enables efficient parallel computation on the data itself. This unlocks massive performance gains and dense, linear scalability while freeing the CPU to handle...

Continue reading...


 
Did AMD purchase Xilinx yet? Seems like this Xilinx thing might be a good thing for them.
 
It's neat, but you can throw a lot of that performance out if you are compressing your SQL DB on the fly. Though in reality, to take advantage of potentially better compression, you should let your storage array handle deduplication.

Here's my question: if you are running a storage array that already has deduplication (which is a form of compression) built in, how will these drives be better, or is offloading that work from the array the whole point?

Also, some storage arrays have compression happening at the controller level before the data is written to disk, so a new storage-array design would be needed to properly leverage the drive and remove that computation cost from the array.

Then I wonder if it will actually be a cost savings for enterprise. The more I think about it, the more I see this as a great midmarket device. Get yourself a home NAS populated with 5 or 6 of these (one as a hot spare, maybe) and you could stash 50 TB of uncompressed data.
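As a rough sanity check on that 50 TB figure (the per-drive capacity and the compression ratio here are assumptions, not published specs): five data drives at roughly 4 TB each is about 20 TB raw, and a workload that compresses around 2.5:1 lands in the neighborhood of 50 TB of logical data. Whether real data gets anywhere near that ratio depends entirely on how compressible it is.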

Also, the idea of running one or two of these as a game library disk makes a lot of sense as well, especially with the exploding size of game installs like COD. Hell, Destiny 2 is 92 GB by itself. Some hardware-based compression would do wonders there.
 
For compression to work well, it has to be tuned to the dataset. Take ZFS, for example: you can set a default compression level for the entire pool, then override it on an individual dataset basis; you can also tune record size and deduplication per dataset as well.
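For concreteness, here is a minimal sketch of that kind of per-dataset tuning, driven through the standard zfs command line from Python. The pool and dataset names (tank, tank/db, tank/archive) and the specific property values are placeholder assumptions, and zstd compression requires a ZFS build that supports it:

```python
import subprocess

def zfs_set(dataset: str, prop: str, value: str) -> None:
    """Apply one ZFS property to one dataset via the zfs CLI."""
    subprocess.run(["zfs", "set", f"{prop}={value}", dataset], check=True)

# Pool-wide default: cheap, fast compression for everything.
zfs_set("tank", "compression", "lz4")

# Database dataset: smaller records to roughly match DB page sizes,
# keeping the cheap compressor so latency stays predictable.
zfs_set("tank/db", "recordsize", "16K")
zfs_set("tank/db", "compression", "lz4")

# Cold archive dataset: large records, a heavier compressor, and dedup
# only if the data is repetitive enough to justify the RAM cost.
zfs_set("tank/archive", "recordsize", "1M")
zfs_set("tank/archive", "compression", "zstd")
zfs_set("tank/archive", "dedup", "on")

# See what the tuning actually bought you.
subprocess.run(["zfs", "get", "compressratio", "tank/archive"], check=True)
```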

The challenge is that the FPGA itself needs to be tuned for the data, which, while entirely possible, is likely not nearly as flexible as doing it on the CPU. So I think the utility of this solution is going to depend heavily on how flexibly the FPGA side is implemented.

The other part of this, aside from compression -- which you may not even want on the drive itself, because then the interconnect becomes the limitation for moving the uncompressed data -- is the more basic idea of doing work directly on the data on the drive. To me, this is the real catch: for the drive to work on its own data, it must have access to all of that data, which means no striping across multiple drives, which means drives must be mirrored for redundancy rather than using parity-based layouts (RAID 5 / RAIDZ, etc.). That seems like a pretty big downside to me!
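A toy illustration of why striping gets in the way of on-drive compute (this is purely a simulation; the drive count, stripe size, and sample data are made up):

```python
# Toy model of round-robin striping: each "drive" ends up holding interleaved
# fragments, so no single drive can run a whole-file computation (compression,
# a scan, an index build) on its own.
STRIPE_SIZE = 4   # bytes per stripe unit (unrealistically small, for illustration)
NUM_DRIVES = 3

data = b"alpha-record-1;beta-record-2;gamma-record-3;"

drives = [bytearray() for _ in range(NUM_DRIVES)]
for offset in range(0, len(data), STRIPE_SIZE):
    drive_index = (offset // STRIPE_SIZE) % NUM_DRIVES
    drives[drive_index] += data[offset:offset + STRIPE_SIZE]

for n, contents in enumerate(drives):
    print(f"drive {n}: {bytes(contents)!r}")
# No drive holds a complete record, so per-drive compute would need either
# mirrored copies or a data layout that keeps whole objects on one device.
```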
 

Yeah, good gravy, I didn't even consider that... unless they use the controller medium to set up a sort of grid array for processing data. The more disks, the more powerful the grid.

This is a super niche device. Still curious, though, as I think smaller installations could work.
 
Is this an enterprise feature? Generally I tend to avoid compression on storage due to the performance penalties.
 

That really depends on the scale of your hardware. Once you are into VNX and Unity systems (or whatever the equivalent is from competitors), the overhead that things like deduplication put on the storage subsystem isn't very impactful at all, especially when you're talking about microsecond-level latencies.
 
Given that this sits right next to the storage, compression seems like a decent application and may still be in play, but it's the ability to do low-latency work on data that would otherwise be 'at rest' that seems more attention-getting. Something like indexing the data, sorting it locally, or running a lightweight machine-learning algorithm makes a fair bit of sense.
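As a rough sketch of the data-movement argument behind that (plain Python simulation only; no SmartSSD APIs involved, and the table size, predicate, and selectivity are invented): filtering where the data lives means only the matches have to cross the host interconnect.

```python
import random

# Simulate a large table stored on the device: (key, value) rows.
random.seed(0)
rows = [(i, random.randint(0, 10_000)) for i in range(1_000_000)]

def host_side_scan(rows, threshold):
    """Baseline: ship every row to the host, then filter there."""
    transferred = len(rows)                      # everything crosses the bus
    hits = [r for r in rows if r[1] > threshold]
    return hits, transferred

def near_data_scan(rows, threshold):
    """Computational-storage style: filter where the data lives,
    only send the matches back to the host."""
    hits = [r for r in rows if r[1] > threshold]
    transferred = len(hits)                      # only matches cross the bus
    return hits, transferred

threshold = 9_900
hits_a, moved_a = host_side_scan(rows, threshold)
hits_b, moved_b = near_data_scan(rows, threshold)
assert hits_a == hits_b
print(f"host-side scan moved {moved_a:,} rows; near-data scan moved {moved_b:,}")
# Same answer either way; the near-data version moves only ~1% of the rows,
# which is the whole pitch for doing indexing/filtering on the drive.
```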

It also makes sense when you consider the sheer storage density just around the corner. 100 TB per device should be feasible in the near future, and unless the data is properly indexed before it's stored, you're left figuring it out after the fact -- which may be necessary anyway, depending on how the data is captured, or if its use changes significantly or just often enough to make indexing at ingestion less than ideal. Being able to do that kind of work on the device itself would help.
 