Xinba, posted May 17, 2018

Hi,

I have 7 AMD GPUs: GPU 1 is an AMD RX580 and the rest are all AMD RX570. There are two bugs.

First, after opening AIDA64, GPU 1, GPU 2, GPU 3 and GPU 4 show both the GPU diode temperature and the VRM temperature normally. However, GPU 5, GPU 6 and GPU 7 show only the GPU diode temperature; the VRM temperature is missing.

Second, after about an hour (maybe just 10 or 20 minutes, I wasn't watching the clock), the diode temperature for GPU 5, GPU 6 and GPU 7 disappears. The cooling fans and fan speeds for GPU 5, GPU 6 and GPU 7 disappear as well. Meanwhile, these readings for GPU 1, GPU 2, GPU 3 and GPU 4 are still fine.

I've tried the beta version, running AIDA64 as administrator, and enabling “Wake GPUs up at AIDA64 startup”, but no luck. So I'm wondering if there will be a fix for this. Thank you!

Here are the logs. Hope they help.

before_disappeared_atigpureg.txt
before_disappeared_atismbusdump.txt
before_disappeared_isasensordump.txt
after_disappeared_isasensordump.txt
after_disappeared_atismbusdump.txt
after_disappeared_atigpureg.txt
Fiery, posted May 18, 2018

Thank you for the dumps. To me it looks as if some of the GPUs in your system simply went to sleep after a while, due to lack of activity or load. Have you tried disabling ULPS?
Xinba, posted May 21, 2018

Hi Fiery,

Thanks for your reply. I followed the steps from the AMD community (https://community.amd.com/thread/176003) to disable ULPS, but the problem is unchanged. However, I made a lucky finding: the GPU5, GPU6 and GPU7 temperature sensors survive much longer after changing the “Update Frequency” settings.

Here are the detailed test steps I have tried so far:

Step#1: Go to Preferences --- Hardware Monitoring --- Update Frequency.
Step#2: All of the Update Frequency settings default to 1000ms. I changed these values, and the results follow the pattern shown in the list below.
Step#3: Enable the Logging function to log all GPU temperature and GPU fan speed data. (When a GPU temperature sensor goes missing, the related GPU fan speed value also becomes zero.) This lets me use the timestamps to calculate an accurate survival time for the GPU5, GPU6 and GPU7 temperature sensors in a later step.
Step#4: Reboot the system, then run AIDA64 as Administrator to start testing and logging.
Step#5: When the GPU5, GPU6 and GPU7 temperature sensors go missing, check the log files to calculate how long the sensors survived.
Step#6: Repeat Step#1 to Step#5 to test the next “Update Frequency” setting.
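The survival-time calculation in Step#5 can be automated. The following is only a minimal sketch: it assumes the log has been exported as CSV with a `timestamp` column and one column per sensor, and the column names and timestamp format are hypothetical, not AIDA64's actual log layout. It treats an empty or zero reading as "sensor missing", following the observation in Step#3.

```python
import csv
import io
from datetime import datetime, timedelta
from typing import Optional


def survival_time(
    log_text: str,
    sensor_column: str,
    time_column: str = "timestamp",          # hypothetical column name
    time_format: str = "%Y-%m-%d %H:%M:%S",  # hypothetical timestamp format
) -> Optional[timedelta]:
    """Return the time from the first log row to the last valid reading.

    A reading counts as missing when it is empty or "0". Returns None if
    the log is empty or the sensor never reported a valid value.
    """
    start = None
    last_good = None
    for row in csv.DictReader(io.StringIO(log_text)):
        stamp = datetime.strptime(row[time_column], time_format)
        if start is None:
            start = stamp
        value = (row.get(sensor_column) or "").strip()
        if value in ("", "0"):
            break  # sensor dropped out; stop at the first missing reading
        last_good = stamp
    if start is None or last_good is None:
        return None
    return last_good - start


# Made-up sample data in the assumed format, for illustration only.
sample_log = (
    "timestamp,gpu5_temp\n"
    "2018-05-21 10:00:00,61\n"
    "2018-05-21 10:12:00,62\n"
    "2018-05-21 10:24:00,60\n"
    "2018-05-21 10:24:01,\n"
)
print(survival_time(sample_log, "gpu5_temp"))
```

Reading a real log file would only need `open(path).read()` in place of the sample string, plus the column name and timestamp format adjusted to match the actual export.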
List of my test results:

All Update Frequency settings changed to 500ms: GPU5, GPU6 and GPU7 temperature sensors survived for 24 minutes
All Update Frequency settings kept at the default 1000ms: GPU5, GPU6 and GPU7 temperature sensors survived for 29 minutes
All Update Frequency settings changed to 2000ms: GPU5, GPU6 and GPU7 temperature sensors survived for 51 minutes
All Update Frequency settings changed to 5000ms: GPU5, GPU6 and GPU7 temperature sensors survived for 1 hour and 56 minutes
All Update Frequency settings changed to 10000ms: GPU5, GPU6 and GPU7 temperature sensors survived for 3 hours and 49 minutes
All Update Frequency settings changed to 30000ms (which is the maximum value): so far so good overnight; I'm still keeping it running to see whether the problem goes away.

Some other background settings:

Before doing these tests, I checked that every "EnableULPS" value in the Windows Registry Editor is "0" (ULPS disabled). After the tests I checked again, and every "EnableULPS" value is still "0".
During these tests, I always kept the “Wake GPUs up at AIDA64 startup” setting enabled.
I have also attached a picture showing my “Update Frequency” settings. The rest of the AIDA64 settings are all kept at their defaults.

Hope my lucky finding helps. Thank you!
Xinba, posted June 1, 2018

Here is an update:

All Update Frequency settings changed to 30000ms (which is the maximum value): GPU5, GPU6 and GPU7 temperature sensors survived for 11 hours and 22 minutes

It looks like the longer the "Update Frequency" interval in milliseconds, the longer the GPU5, GPU6 and GPU7 temperature sensors survive.

I kept my system unchanged and only switched to the release version of HWiNFO64 v5.82 to run the same test for 24 hours. All of the GPU temperature sensors stayed good the whole time, without any problem.
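One way to look at the pattern in these results is to convert each survival time into an approximate number of sensor polls. This is a rough sketch only: it assumes one poll per "Update Frequency" interval, which is an assumption about how AIDA64 behaves rather than something confirmed in this thread. The data points are the survival times reported in the posts above.

```python
# (update interval in ms, survival time in minutes), taken from the
# results reported in this thread.
results = [
    (500, 24),
    (1000, 29),
    (2000, 51),
    (5000, 116),    # 1 hour 56 minutes
    (10000, 229),   # 3 hours 49 minutes
    (30000, 682),   # 11 hours 22 minutes
]


def polls_before_dropout(interval_ms: int, survived_minutes: int) -> int:
    """Estimate how many polls happened before the sensors disappeared.

    Assumes exactly one poll per update interval (an assumption).
    """
    return round(survived_minutes * 60_000 / interval_ms)


for interval_ms, minutes in results:
    polls = polls_before_dropout(interval_ms, minutes)
    print(f"{interval_ms:>6} ms: {minutes:>4} min, about {polls} polls")
```

Under that assumption, the 5000ms, 10000ms and 30000ms runs come out at 1392, 1374 and 1364 polls, while the 500ms run comes out at 2880. So at the longer intervals the dropout seems to track the number of polls more than elapsed time, which would not be expected if the GPUs were simply going to sleep after a fixed idle period. This is only a hint from six data points, not a diagnosis.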