Z's 4090 Thread

Ugh.

I am having a nightmare scenario with this 4090.

The water block from EKWB came early, so I decided to get a start installing it.

I took the stock backplate and cooler off of the MSI board only to discover that one of the caps is loose. Like almost coming off the board.

I didn't apply any kind of force to the thing disassembling it, so I don't think I could have done it to the board. It must have been that way from the factory. Maybe failed wave solder or something.

I think the stock cooler applies pressure to the cap that keeps it in contact with the board, but as soon as I take it off it comes loose.

The EKWB waterblock does not apply pressure in the same way to the caps, so I can see the caps through the plexi flapping with motion...

So. I reinstalled the factory cooler and the GPU is back in the testbench. I am stress testing it now and it works just fine, but it probably wouldn't without the pressure on the cap keeping it in place, and who knows about longevity even with the stock cooler pushing the cap in place?

So the question is what to do.

1.) I could play coy and RMA it and say it isn't working, but...

a.) it's going to be pretty obvious I took the cooler off, as I had to rip the little MSI sticker off one of the screws. I know it is illegal to deny warranty in the U.S. if you worked on something yourself, but that usually results in a fight with the warranty department.

b.) if they just stick it in a test machine and start running stress tests on it, they are going to find that it works just fine, and may even send it back as "no problem found"...

2.) Or I could just straight up tell them, I took the cooler off, and there is a loose cap on the board, and then see how that fight goes.

Anyone have any suggestions?
 
I take no responsibility for this, but I'd start with the honest answer. See if you can get video documentation of the problem to bolster the cause behind your concern.

I appreciate your opinion. This is the approach I was leaning towards as well, but I wanted to send it out there and see if anyone thought I was being crazy for being honest with a GPU company considering they always seem to look for ways to screw us over.
 
Well, I went ahead and did it.

The good news? The RMA request seems to be automatically approved.

They - of course - reserve the right to reject it when they receive it, so that might be when the fight starts.

They also have a 15-35 day turnaround excluding shipping, and they are in CA, opposite side of the country, so I am counting on 5-7 days each way.

See you guys in the late February/early March timeframe... *sigh*
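For anyone who wants to run the numbers on that wait, here is a minimal sketch of the estimate. It assumes the quoted 15-35 day turnaround and my 5-7 day shipping guess are both calendar days and that shipping takes the same time in each direction. The function name `rma_window_days` is made up for illustration.

```python
def rma_window_days(turnaround_min, turnaround_max, ship_min, ship_max):
    """Return (best, worst) total days: ship out + turnaround + ship back.

    Assumption: shipping takes the same range of days in both directions.
    """
    best = turnaround_min + 2 * ship_min
    worst = turnaround_max + 2 * ship_max
    return best, worst


# Figures from the post: 15-35 day turnaround, 5-7 days shipping each way.
best, worst = rma_window_days(15, 35, 5, 7)
print(f"Card back in roughly {best} to {worst} days")
```

The worst case is what I am planning around.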
 
I'm glad the RMA process has started and so sorry you're having to go through that nightmare. Regardless of the component, I always hold my breath the first time I do a power-up and then again when I make any hardware changes to a rig. It doesn't matter how many times I've done something, I still get nervous.
 
Same.

I may not have had a failure like that in many years, but I still cross my fingers when I press the button for the first time.

In this case it's weird. The cap is literally loose, with one of the two legs not touching the board with the heatsink off, and the other looks like it is hanging on by a thread, but the board works perfectly and passed hours of stress tests (I did Time Spy on a loop) last night after I discovered it.

I feel very confident I did not damage the thing. It looks like a fairly typical incomplete solder issue to me that happens from time to time in wave-solder applications. It should have been caught at the factory, but manufacturers are imperfect.

I am just concerned that they are going to blame me for it, since I took the cooler off. I am prepared for a war of words, and I will drag MSI's name through the mud on every platform I can if they reject my RMA.

I'm seriously bummed though. I may be in my 40's, but new GPU's make me giddy as a kid in a candy store, and now I'm going to have to wait probably another ~50 days before I can actually use the thing, even if all goes well with MSI.
 
I'm in my 50s and they are as bad as any addiction I've ever faced but I also get as happy as can be whenever I get one up and running. Fingers crossed for you!
 
Same here, and this hobby is what keeps me happy other than my wife.
I hope the turn around is a positive experience and you get the card back in your hands soon.🤞
 
So, MSI wound up swapping out the GPU for a new one which arrived today!

I'm glad MSI wound up not being like certain brands (which shall remain unnamed right now) which have fought me on warranty claims in the past. I'm happy to recommend them based on that.

Time to pop this bad boy in the test-bench for some pre-waterblock installation stress testing!
 
Just a heads up, because I spent a bunch of time with mine last weekend and did more overclocking experiments. Pretty much not worth it beyond bragging rights. I even did another Timespy run and got maybe 200 more points than my last run at stock. Depending on the game I could get mine to hold in the low-mid 2900s and ~11700 just like what Brent managed in his OC review but that only got maybe a few extra FPS and I've also read other overclocking reviews for these where in some circumstances the reviewer saw decreased FPS performance. With mine the stock boost hits 2750 and with the stock mem, it performed within 1-5 FPS of being overclocked, perfectly stable, and the fans were totally silent, and power was mostly in the 300-400w range. I still saw some spikes where power went higher but it was even more so with the overclocks for a minimal gain.

I do admit though, I'll be very interested in your OC numbers once the block is on but I still don't believe that OC is worth it on these.
 
What limit do you think you are hitting first when trying to overclock?

Heat? Voltage? Power?

I wound up with the version of the 4090 that has the least power delivery at a max of ~450w, but judging by what people are saying, power isn't really the biggest limiting factor on these.

That, and maybe if I get it cool enough, it will consume less power as well.
 
Heat - nope, with fans on full it was under 60 and often even in the low 50s, I was even able to pull them down to around 70-80% and keep close to the same.

Voltage - maybe. Never adjusted it.

Power - Doubtful. Got the actual 12VHPWR connector going to a 1200W supply. David taught me a long time ago to actually dial back the memory due to high clocks pulling more power on the board and thus causing the GPU to get less.

It's possible the 5800X3D might be a limiting factor.

I haven't read other reviews, but it sounds like they jacked memory speeds too high and were penalized by throwing errors that got ECC corrected...
Trying to find that review right now. I read it about a month or two ago but at the time I was reading a bunch so I can't remember which site it was.

By the way, has anyone here experimented with turning on the ECC mode with any of the Ampere or Ada cards yet? I noticed the setting in CP last week and was googling it, and there are a few posts out there on Reddit, but I didn't see any professional reviews talk about it yet. Evidently NVIDIA enabled it with either the 3090 or 3090 Ti and kept it for the 4090 as well.

1674785602218.png

Here's the settings that I was using that were mostly stable but depending on the game would crash. Witcher 3 got really finicky during various cutscenes. Metro Exodus was a bit nicer but would occasionally crash out of the blue. Timespy ran fine.

1674785763076.png

1674786450269.png

1674786499860.png
 
Good info, thanks.

I know next to nothing about GPU ECC, but if it behaves like main system RAM, enabling ECC will probably just result in worse timings, with little to no benefit for gaming loads. Very useful for important compute work though, where a flipped bit can ruin a lot of people's research. In gaming, what's going to happen? Once or twice a year on one frame that is displayed for at most 16.7ms one pixel might be the slightly wrong shade of green :p

I think the performance from better timings is probably a priority over ECC for gaming loads.
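For reference, the 16.7ms figure is just the time a single frame stays on screen at 60 Hz, and 60 Hz is my assumption here. A quick sketch of the arithmetic, with a made-up helper name `frame_time_ms`:

```python
def frame_time_ms(refresh_hz):
    """Time one frame stays on screen, in milliseconds, at a given refresh rate."""
    return 1000.0 / refresh_hz


# 60 Hz is the assumed baseline; the higher rates are only for comparison.
for hz in (60, 120, 144):
    print(f"{hz} Hz -> {frame_time_ms(hz):.1f} ms per frame")
```

At higher refresh rates a bad pixel would be on screen for even less time.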
 
Just tried a little more experimenting. I think maybe I just didn't win the silicon lottery with this one. In trying to find that review again, still haven't, but googled a bunch of suprim x liquid reviews and most show it capable of OCing upwards of 3 GHz and sometimes over. I'm hitting a wall. I've seen it hold 2910, 2925, and 2940, and then crash at 2955. I also removed the mem OC. Ultimately I've got +160 on the core, a +110 power limit (which for this card is 530W). One thing I've noticed is that stability during benching doesn't always transfer to gaming. I've spent hours benching only to switch over to a game and see it randomly crash within 30 minutes.
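If you want to keep track of where the wall is, here is a rough sketch of how the attempts above could be logged. The helper names are made up, and "stable" only means "did not crash while I was watching", which, as noted, doesn't always carry over from benches to games.

```python
def highest_stable(results):
    """results: list of (clock_mhz, stable) pairs -> highest stable clock, or None."""
    stable = [mhz for mhz, ok in results if ok]
    return max(stable) if stable else None


def step_sizes(results):
    """Gaps in MHz between consecutive clocks that were tried."""
    clocks = sorted(mhz for mhz, _ in results)
    return [b - a for a, b in zip(clocks, clocks[1:])]


# Clocks from the post: held 2910, 2925 and 2940, crashed at 2955.
observed = [(2910, True), (2925, True), (2940, True), (2955, False)]
print("Highest clock that held:", highest_stable(observed), "MHz")
print("Step between attempts:", step_sizes(observed), "MHz")
```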

The good news. By not using the mem OC I almost never saw power go over 500W in running the canned benches for CB2077, Metro Exodus, and SOTTR. It seems by using the stock mem speed it reduced around 15-30W. I figured since it was using less power might as well let it try the stock fan curve and it was keeping it 62-65c with around 50-60% fan speeds, often lower in some spots.

Meanwhile, for kicks, I do recommend that if you have any of the newer RE games (there's a demo for the 2nd out there), tweak the IQ setting to 200%. Check out that VRAM usage. The image is washed out because it was captured while using HDR.


re3_2023_01_26_20_29_00_469.jpg

In the end, though I wouldn't be too concerned with the power limits of the model you have. Most of the OC headroom is between 430-500W and in terms of temps I've seen it hold 2910 even at 64c.

I did notice that Precision X1's voltage adjuster is unlocked so maybe I'll try tweaking that a bit. I've heard there is a version of AB that can do it, but for me the option stays greyed out even though I've got it checked in the settings. However, I read in one of those reviews for the Suprim X that NVIDIA didn't want voltage unlocked in order to prevent people from damaging the GPU.

These guys got theirs up to 3090 MHz.
https://hothardware.com/reviews/msi-geforce-rtx-4090-suprim-liquiq-x-review?page=5
 
Here's an example, as close as I could take the snapshot because this benchmark has wildly changing power but I tried to get the same frame spot with and without the mem OC.
First with the mem OC: it spikes at ~528W. I did another run and saw ~525W here as well.
MetroExodus_2023_01_26_21_03_30_540.jpg


And now without the mem OC: it spikes at ~521W. I did another run and it was closer to 516W.

MetroExodus_2023_01_26_21_27_16_226.jpg
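To put a number on those snapshots, here is a small sketch that averages the two readings for each case and takes the difference. The values are the approximate spikes quoted above. The variable and function names are just for illustration.

```python
from statistics import mean

# Approximate spike readings in watts from the two runs of each configuration.
with_mem_oc = [528, 525]
without_mem_oc = [521, 516]


def mean_delta_watts(a, b):
    """Difference between the average readings of two sets of runs."""
    return mean(a) - mean(b)


delta = mean_delta_watts(with_mem_oc, without_mem_oc)
print(f"Mem OC costs about {delta:.1f} W at this spot in the benchmark")
```

That works out to about 8W at this particular frame, which is less than the 15-30W I saw across the other canned benches. Two runs per setting is not much to go on, so treat it as a rough indication only.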
 
Anyway, I meant to post all of this sooner, but I got sidetracked by my "loose cap" issue.

So to start. Everything is huge. The GPU with the stock cooler is huge. So is the box the EKWB water block comes in.

1674967438908.png

Seriously, the EKWB box looks like it should contain a 420mm (3x 140mm) radiator!

1674967526837.png

Inside we have the water block, packaged pretty nicely, and the orange accessory boxes which include such things as screws, thermal pads, G1/4 plugs for the unused ports, etc.

Notably absent was the instruction manual (they tell you to get it online). Also, while the Founders Edition version of the water block comes with a single-slot-width PCIe IO panel thingie, this version does not. That means that if I don't find an alternative (and nothing I've seen on eBay seems to look like it fits MSI's mounting holes), I'll be stuck using three slots for what is now a single-slot GPU.


As can be seen, the water block will make this GPU way more reasonable in size:

1674967747378.png

1674967789614.png

1674967830464.png
 
Time to get started (Sprite can for size reference, I don't actually drink the sugary stuff)

I'm not usually a big anti-static mat and wrist strap guy, preferring to just exercise caution and not touch anything sensitive, but this is the most expensive GPU I've ever bought by a wide margin, and when I started work, it was 20% RH in my office, so I didn't want to take any stupid risks.

1674967926837.png


The instructions say to remove 12 screws from the backplate. Turns out there are actually 14 screws.

1674968063191.png

I vaguely remember this being an issue with the instructions for the water block on my Pascal Titan X back in 2016 as well. Either they count differently in Slovenia, or GPU manufacturers have a nasty habit of changing the design between when the manual is written and when the GPU hits the market...

After that the backplate just lifts right off:

1674968197560.png

Then you undo the 4 screws, remove the bracket behind the chip, and just lift the board right off the cooler:

1674968393515.png

This is where I discovered my capacitor problem on my first GPU, and stopped:

1674968449984.png


1674968482001.png

I tried straightening the cap, but whenever I didn't press on it, the right leg lifted up off of the pad. A user on the hardforums who has worked with wave solder operations suggested that there was probably insufficient solder printed on the board prior to pick and place, and it just didn't flow enough.

1674968776002.png

If you look closely you can see that the leg is completely clear of the solder pad. I'm guessing the stock cooler just held it in place once installed, as both before removing the cooler, and after reinstalling it, the GPU worked just fine in stress tests.

This is where I RMA'd the GPU. Luckily that went smoothly. Rather than repair it, they decided to replace it, so I got a new brown box version sent back to me.

For a while I thought it got stolen during shipping, because it made its way with FedEx Ground nicely from California all the way to Connecticut, where it stopped for 3-4 days. I was totally expecting them to say they lost the package, but then after the wait, it resumed its travels and the new board arrived!
 