
7 AMD GPUs and only 4 Temp sensors are still available after nearly one hour


Xinba


Hi

   I have 7 AMD GPUs: GPU 1 is an AMD RX580 and the rest are all AMD RX570s. I am seeing two bugs.

   First, after opening AIDA64, GPU 1, GPU 2, GPU 3 and GPU 4 show both the GPU diode temp and the VRM temp normally. However, GPU 5, GPU 6 and GPU 7 show only the GPU diode temp; the VRM temp is missing.

  Second, after about an hour (maybe just 10 or 20 minutes, I didn't keep looking at the clock), the diode temp for GPU 5, GPU 6 and GPU 7 disappears. The cooling fans and fan speeds for GPU 5, GPU 6 and GPU 7 disappear as well. In the meantime, these readings for GPU 1, GPU 2, GPU 3 and GPU 4 are still fine.

  I've tried the beta version, running AIDA64 as administrator, and enabling “Wake GPUs up at AIDA64 startup”, but no luck.

  So I'm wondering if there will be a fix for this. Thank you!

 

  Here are the logs. Hope they help.

 

before_disappeared_atigpureg.txt

before_disappeared_atismbusdump.txt

before_disappeared_isasensordump.txt

after_disappeared_isasensordump.txt

after_disappeared_atismbusdump.txt

after_disappeared_atigpureg.txt


Hi Fiery,

  Thanks for your reply. I followed these steps (https://community.amd.com/thread/176003) from the AMD community to disable ULPS, but the problem is unchanged.

 

  However, I made a lucky finding: the GPU5 GPU6 GPU7 temp sensors survive much longer after changing the “Update Frequency” settings. Here are the detailed test steps I have tried so far:

 

  Step#1: Go to Preferences --- Hardware Monitoring --- Update Frequency.

  Step#2: All of the Update Frequency settings default to 1000ms. I changed these values to see what happens, and the results follow a pattern, as shown in the list below.

  Step#3: Enable the Logging function to log all GPU temp and GPU fan speed data, so that in Step#5 I can calculate an accurate survival time for the GPU5 GPU6 GPU7 temp sensors from the timestamps. (When a GPU temp sensor goes missing, the related GPU fan speed value also becomes zero.)

  Step#4: Reboot the system, then run AIDA64 as Administrator to start testing and logging.

  Step#5: When the GPU5 GPU6 GPU7 temp sensors go missing, check the log files to calculate how long these sensors survived.

  Step#6: Repeat Step#1 to Step#5 to test the next “Update Frequency” setting.
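
  The survival-time calculation in Step#5 could be scripted along the following lines. This is only a minimal sketch: the CSV layout, the column names (`timestamp`, `gpu5_fan_rpm`), the timestamp format and the sample rows are assumptions made up for illustration, not the actual AIDA64 log format, so they would need to be adapted to the real log files.

```python
import csv
import io
from datetime import datetime


def is_missing(value):
    """True when a logged reading is empty, non-numeric, or zero."""
    try:
        return float(value) == 0
    except (TypeError, ValueError):
        return True


def survival_time(log_text, column, time_format="%Y-%m-%d %H:%M:%S"):
    """Time from the first log row to the first row where `column` is missing.

    Returns None if the reading never drops out.
    The "timestamp" column name is a hypothetical placeholder.
    """
    start = None
    for row in csv.DictReader(io.StringIO(log_text)):
        stamp = datetime.strptime(row["timestamp"], time_format)
        if start is None:
            start = stamp
        if is_missing(row.get(column)):
            return stamp - start
    return None


# Made-up sample rows in the assumed layout (not real measurements).
sample = """timestamp,gpu5_fan_rpm
2018-01-01 10:00:00,1200
2018-01-01 10:00:01,1210
2018-01-01 10:29:00,0
"""

print(survival_time(sample, "gpu5_fan_rpm"))  # 0:29:00
```

  The fan speed column is used as the marker here because, as noted in Step#3, it drops to zero at the same moment the temp sensor goes missing.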

 

  List of my test result:

  All Update Frequency settings changed to 500ms  ----- GPU5 GPU6 GPU7 temp sensors survived for 24 minutes

  All Update Frequency settings kept at the default 1000ms  ----- GPU5 GPU6 GPU7 temp sensors survived for 29 minutes

  All Update Frequency settings changed to 2000ms  ----- GPU5 GPU6 GPU7 temp sensors survived for 51 minutes

  All Update Frequency settings changed to 5000ms  ----- GPU5 GPU6 GPU7 temp sensors survived for 1 hour and 56 minutes

  All Update Frequency settings changed to 10000ms  ----- GPU5 GPU6 GPU7 temp sensors survived for 3 hours and 49 minutes

  All Update Frequency settings changed to 30000ms (the max value)  ----- So far so good overnight; I'm keeping it running to see whether the problem goes away or not.

 

  Some other background settings: before doing these steps, I checked that every "EnableULPS" value in the Windows Registry Editor is "0" (ULPS disabled). After the tests I checked again, and every "EnableULPS" value is still "0".

  During these tests, I always kept the “Wake GPUs up at AIDA64 startup” setting enabled.

  I also attached a picture showing what my “Update Frequency” settings look like.

  The rest of the AIDA64 settings are all left at their defaults.

 

  Hope my lucky finding helps. Thank you!

 

 

30000ms.PNG.75619137323b3f9fc3c03c7f204a307e.PNG


  • 2 weeks later...

Here is an update:

All Update Frequency settings changed to 30000ms (the max value)  ----- GPU5 GPU6 GPU7 temp sensors survived for 11 hours and 22 minutes

It looks like the longer the "Update Frequency" interval, the longer the GPU5 GPU6 GPU7 temp sensors survive.
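
As a rough cross-check of that pattern, the reported survival times can be converted into an estimated number of sensor updates before the readings vanished. This is a sketch under one assumption that is not confirmed anywhere in this thread: that AIDA64 reads each sensor exactly once per update interval. The function name is just illustrative.

```python
# Survival times reported in this thread: update interval (ms) -> minutes.
results = {
    500: 24,
    1000: 29,
    2000: 51,
    5000: 1 * 60 + 56,
    10000: 3 * 60 + 49,
    30000: 11 * 60 + 22,
}


def polls_before_failure(interval_ms, survived_minutes):
    """Estimated update count, assuming one sensor read per interval."""
    return survived_minutes * 60 * 1000 // interval_ms


for interval_ms, minutes in results.items():
    polls = polls_before_failure(interval_ms, minutes)
    print(f"{interval_ms:>6} ms  {minutes:>4} min  ~{polls} updates")
```

Under that assumption the estimates are 2880, 1740, 1530, 1392, 1374 and 1364 updates. From 1000ms upward they fall in a fairly narrow band, which would be consistent with the sensors dropping out after a certain number of reads rather than after a certain amount of time; the 500ms run fits that picture less well.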

 

I left my system unchanged and only switched to the HWiNFO64 v5.82 release version to run the same test for 24 hours. All of the GPU temp sensors stayed available the whole time without any problem.

 

 
