You kinda can. I'm no math wiz, but I did take a Statistics class. You may not know with multi-decimal point precision, but for what we are all looking at the numbers for - it's more than accurate enough.
Yeah, so in general, with variable data, unless the standard deviation is huge, about 30 data points is the rule of thumb for reasonably decent accuracy. I can't quantify that without knowing the actual sample size and the standard deviation, but I'm sure they sample at least a few thousand every month, which is well in excess of 30.
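For a rough feel of what "a few thousand" buys you, here is a quick sketch using the standard normal-approximation margin of error for a share (like "percent of users with GPU X"). The sample sizes are my guesses, not Valve's actual numbers, and p = 0.5 is just the worst case:

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion (normal approximation).

    n: sample size (hypothetical here, the real survey size is not public)
    p: assumed true share; 0.5 gives the widest margin
    z: 1.96 corresponds to roughly 95% confidence
    """
    return z * math.sqrt(p * (1 - p) / n)

for n in (30, 3000):
    print(f"n={n}: +/- {margin_of_error(n) * 100:.1f} percentage points")
```

So 30 samples gets you within roughly 18 points either way on a share, while 3000 gets you within about 2. All of that assumes a random sample.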
You could take the binomial attribute figures as a worst case, and at that point you get to 95% confidence of 95% reliability with only 59 samples (with zero failures), and 95% confidence of 99% reliability with 299 samples. So the few thousand should be more than enough.
This, of course, relies upon your sample being truly random, which is where most polls that fail go wrong: they can't get a truly random and representative sample.
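If you want to check the 59 and 299 figures, they come from the zero-failure ("success run") relationship: you need the smallest n where reliability^n <= 1 - confidence. A minimal sketch (the function name is mine):

```python
import math

def zero_failure_sample_size(confidence, reliability):
    """Smallest n such that reliability**n <= 1 - confidence.

    Assumes every one of the n samples passes (zero failures).
    """
    return math.ceil(math.log(1 - confidence) / math.log(reliability))

print(zero_failure_sample_size(0.95, 0.95))  # 59
print(zero_failure_sample_size(0.95, 0.99))  # 299
```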
If - for instance - your 30 samples all fall in high income countries/regions, the results are going to be very different than if they all fall in lower income countries/areas, etc.
And it gets really complicated in a hurry when you try to correct for all of that and make sure your sample includes a representative mix of people across regions, income levels, genders, etc. This is what the political pollsters try to do, and they are often pretty good at it. When they fail, it is almost always because they made an incorrect assumption in their corrections to make the sample represent the voting public. Usually that is around who to consider a "likely voter", since certain elections bring out people who have never voted before and never will again, depending on the news/politics of the day.
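As a toy illustration of that kind of correction (post-stratification), here is a sketch with completely made-up numbers: a sample that over-represents a high income region, reweighted using what we assume the real regional mix to be. The `population_share` values are exactly the sort of assumption that can be wrong:

```python
# Hypothetical share of the real population in each region (an assumption).
population_share = {"high_income": 0.3, "lower_income": 0.7}

# Hypothetical sample: region -> (respondents, respondents with a high-end GPU).
sample = {
    "high_income": (800, 480),
    "lower_income": (200, 20),
}

def raw_estimate(sample):
    """High-end GPU share taken straight from the sample, no correction."""
    total = sum(n for n, _ in sample.values())
    hits = sum(h for _, h in sample.values())
    return hits / total

def post_stratified_estimate(sample, population_share):
    """Reweight each region's observed rate by its assumed population share."""
    return sum(population_share[r] * h / n for r, (n, h) in sample.items())

print(f"raw:       {raw_estimate(sample):.2f}")
print(f"corrected: {post_stratified_estimate(sample, population_share):.2f}")
```

With these numbers the raw figure is 0.50 and the corrected one is 0.25, so the correction matters a lot, and so does getting the assumed mix right.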
That's why I'm usually in favor of just getting as many samples as you possibly can, as long as they aren't too difficult or costly to collect. With Steam, it should be trivial both cost wise and work wise to just collect the data from ALL machines that Steam is installed on.
I mean, it will require a decent sized database for the millions of Steam client installs, but they probably need that already as it is.
Even then though, you need to decide whether and how to correct the data. For instance, do you weight all clients equally? A client that is logged in and playing games many hours a week is probably more significant than someone who logs on once every couple of months for some light gaming. So maybe you weight the data points by how many hours the client has been actively playing a title during the survey period.
Is that more representative? Or is it worse? Because now you are probably weighting things towards more serious gamers, who are probably more likely to prioritize their hardware, and you may lose out on the casual occasional gamers. If the stats are used by game devs to help them figure out what their target audience is using, and how to optimize titles, do you want a game that only super serious gamers can run, or do you want something you can sell to the broadest range of potato computer users?
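To show how much the hours weighting can move a number, here is a small sketch with invented clients (the tiers and hours are hypothetical, not real survey data):

```python
# Hypothetical clients: (GPU tier, hours played during the survey period).
clients = [
    ("high-end", 120),
    ("high-end", 80),
    ("low-end", 5),
    ("low-end", 10),
    ("low-end", 2),
    ("low-end", 3),
]

def share(clients, tier, weighted):
    """Share of `tier`, counting each client once or by hours played."""
    total = sum(hours if weighted else 1 for _, hours in clients)
    hits = sum(hours if weighted else 1 for t, hours in clients if t == tier)
    return hits / total

print(f"equal weight: {share(clients, 'high-end', weighted=False):.0%}")
print(f"by hours:     {share(clients, 'high-end', weighted=True):.0%}")
```

Same six machines, but high-end goes from a third of the clients to about 91% of the playtime. Neither number is wrong, they just answer different questions.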
You have to try to predict who the consumer of the data is, and what their priorities are. For the hardware enthusiast crowd, this weighting towards those who play lots of hours is probably preferable, as it helps you see what your fellow enthusiasts are using. Game devs might be more interested in something weighted towards those who spend more on games (maybe even on their specific genre/type of game). But as soon as you do that, you may be down-weighting potential customers who might have bought a game if they only thought they had the hardware to be able to run it.
So it gets complicated in a hurry. Stats are a way to take a bolus of confusing data and turn it into something more black and white for decision-making purposes, but they can really be influenced by a lot of these decisions in the process.
The gold standard would be to have the raw data and be able to analyze it yourself, but that will likely never happen for obvious reasons.
Either way, the Steam Hardware Survey is one of the biggest datasets we have, and although we can argue about whether their sample size is large enough, whether it is representative, and whether it is weighted right, it can still help us learn something we otherwise didn't know, and that is valuable IMHO.