Logging CRON messages to a different file

Noticed that a lot of the messages in /var/log/syslog are from CRON and anacron?  Would you prefer that they be directed to a different file?  Easy Peasy.

I use cron to schedule the execution of a number of programs that perform a variety of tasks (e.g. backups).  By default, Ubuntu 18.04 is configured to log everything that cron does to /var/log/syslog.  I previously explained how to suppress messages from certain programs, but in this case I just want the messages logged to a different file.

To accomplish that, you’ll need to edit rsyslog’s configuration file:

$ sudo nano /etc/rsyslog.d/50-default.conf

and edit the two lines that look like this:

*.*;auth,authpriv.none   -/var/log/syslog
#cron.*                  /var/log/cron.log

so they look like this:

*.*;auth,authpriv,cron.none  /var/log/syslog
cron.*                       /var/log/cron.log

Basically, you want to accomplish two things:

  1. You want to stop logging cron messages to /var/log/syslog
  2. You want to start logging cron messages to /var/log/cron.log (or some other file)

Adding ‘,cron’ to the first line accomplishes 1.  Removing the ‘#’ from the second line accomplishes 2.

For the changes to take effect, just restart the rsyslog daemon:

$ sudo service rsyslog restart

The next time a cron job runs, /var/log/cron.log will be created, and that’s where cron will log all of its messages from that point on.  Ditto for anacron.

That’s it.  You’re done.  Enjoy!

But wait…

Q:  Can you explain how *.*;auth,authpriv,cron.none /var/log/syslog works?

Sure.  Each message has a facility and a severity.

The facility indicates the type of service that the program is providing (e.g. kernel, mail, cron, authentication).  You can find a full list here.  Multiple programs that perform similar or related functions can use the same facility when logging messages.  In this case the programs ‘cron’ and ‘anacron’ both specify ‘cron’ as the facility.  Facility names were developed a long time ago — often in a context where only one program existed that provided a particular service — so many times the name of the facility is the same as the name of a program.

The severity was explained in the last blog post on this topic.

You can mix and match facilities and severities using the . (dot) notation where the facility is specified before the dot and severity is specified after the dot.  A * (asterisk) is a wildcard that matches all facilities and severities — depending on what side of the dot it appears.

Examples:

  • cron.err will match cron messages issued with a severity level of ‘error’ or higher (use cron.=err to match ‘error’ only)
  • cron.warning will match all cron messages with a severity level of ‘warning’ or higher
  • cron.* will match all cron messages, regardless of severity
  • *.err will match all messages with a severity level of ‘error’ or higher, regardless of what facility they were logged under
  • *.* matches all messages from all facilities regardless of severity

Thus something like *.*  /var/log/syslog is all you need to log all messages to syslog.  It’s like a ‘catch all’.

Now, if you want to specify multiple things to be logged to the same file, you can use a semi-colon like this:  auth.alert;cron.err  /var/log/myCustomLog

If you want to treat groups of messages of a given severity level the same way (e.g. mail.err;cron.err) you can chain together the different facilities using commas like this:  mail,cron.err  /var/log/myCustomLog

Finally, there is a special type of ‘severity’ that only works inside rsyslog, and that’s ‘none’.  The ‘none’ severity simply means “don’t log it” or, perhaps more subtly, “none of these messages should be included in this log”.  Thus cron.none /var/log/syslog makes sure that no cron messages will be logged in syslog.

You can combine * ; , and none all on the same line, so what *.*;auth,authpriv,cron.none /var/log/syslog actually means is:

  • log all facilities and severities
  • but log none of the messages from auth,authpriv,cron
  • to the file /var/log/syslog

Once you break it down it’s not too tricky.
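To make the rules above concrete, here is a minimal Python sketch of how a selector line gets evaluated.  It is a simplified model assumed for illustration, not rsyslog's actual parser: the function name selector_matches is made up, it follows the classic syslog convention that a bare severity such as err means "err or more severe" (with =err meaning exactly err), and it ignores other selector extensions such as !.

```python
# Simplified model of classic syslog selectors (NOT rsyslog's real parser).
# Severities are listed from most severe (index 0) to least severe (index 7).
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]


def selector_matches(selector, facility, severity):
    """Return True if a message with this facility/severity would be logged."""
    level = SEVERITIES.index(severity)
    logged = False
    for part in selector.split(";"):
        facilities, _, priority = part.partition(".")
        names = facilities.split(",")
        if "*" not in names and facility not in names:
            continue  # this part says nothing about our facility
        if priority == "none":
            logged = False  # explicitly excluded
        elif priority == "*":
            logged = True  # every severity
        elif priority.startswith("="):
            logged = level == SEVERITIES.index(priority[1:])  # exactly this severity
        else:
            logged = level <= SEVERITIES.index(priority)  # this severity or worse
    return logged


rule = "*.*;auth,authpriv,cron.none"
print(selector_matches(rule, "cron", "info"))          # False: cron is excluded
print(selector_matches(rule, "mail", "info"))          # True: caught by *.*
print(selector_matches("cron.err", "cron", "crit"))    # True: crit is worse than err
print(selector_matches("cron.=err", "cron", "crit"))   # False: only err itself
```

In this model the last part that mentions a facility wins, which is why the trailing cron.none overrides the leading *.* for cron messages.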

Q: What does the hyphen in front of the filename mean?

Astute readers probably noticed a hyphen in front of some of the filenames in /etc/rsyslog.d/50-default.conf (e.g. -/var/log/syslog).  And yes, they are hyphens, not tildes.

In the old days — when drives and systems were really slow and unreliable — it was risky to hold back (buffer) log entries in memory and then ‘sync’ them to disk all in one hit.  Sure, it improved performance, but you could lose (sometimes important) log entries if the system went down.  Thus rsyslog defaulted to immediately syncing/flushing out writes.  A user that wanted to ‘switch off’ the syncing, and buffer writes for improved performance, could prepend a log file name with a hyphen.  Thus -/var/log/syslog once meant “don’t sync writes to syslog”.  The hyphen was a ‘sync switch’.

Times changed, system reliability improved, and at some point the folks who develop rsyslog decided that the default behaviour should be changed.  Now writes are not synced by default — they are buffered.  The hyphen doesn’t actually do anything anymore — unless you add an $ActionFileEnableSync on entry to /etc/rsyslog.conf.

You can happily ignore the hyphens on any modern system.  Strip them out if you like (as I did above)  — it makes no difference because they are ignored anyway.  They are just artefacts from the past.


Hmm. We’re having trouble finding that site.

Having problems connecting to sites with browsers like Firefox and Chrome?  Do pages seem to hang with a “Performing TLS Handshake” message at the bottom?  Is it particularly bad if you try to open multiple tabs all at once?  Here’s one reason why that happens, and how to fix it.

I do most of my web surfing from a desktop computer, and I’ve been seeing this error message a fair bit lately:

[Image: Firefox “Hmm. We’re having trouble finding that site.” error page]

(That’s from Firefox.  Other browsers like Chrome report connection errors in similar ways.)

I tried the usual things (e.g. clearing caches, restarting the browser, tracing routes to the site I was trying to connect to, rebooting the computer, cycling the router, checking my ISP’s status page for outages, seeking enlightenment from the Duck) but none of that helped.

The connection failures seemed random.  They would crop up now and then, with no apparent pattern.  They weren’t isolated to any particular sites.  When I was just surfing the web normally it would happen every-so-often — maybe a few times a night.  Just enough to be annoying, but not enough to give me a clear clue as to their cause.

What I did notice is that it happened more frequently when I opened multiple tabs at once.  I have over 20 news sites in a single bookmark folder, and CTRL-clicking the folder opens all the tabs at the same time.  It was pretty-much certain that at least one site would fail to load properly when I did that.

Even if a site failed to load, pressing F5 (a simple reload) would always load it successfully on the second attempt.  Weird.

Further, I use KVM on Ubuntu to develop software in virtual machines, and I noticed odd network behaviour in there as well.  Behaviour that, well, just seemed out of place.  Linux servers tend to be rock solid, and their networking tends to be rock solid.  You get used to the reliability and performance.  So when the networking starts misbehaving — even a little — you tend to notice.

Scanning /var/log/syslog I noticed an unusually high frequency of the following types of message:

systemd-resolved : Server returned error NXDOMAIN, mitigating potential DNS violation DVE-2018-0001, retrying transaction with reduced feature level UDP.
kernel : enp37s0 NIC Link is Down

A DNS lookup failing immediately followed by the network card being reported as down (and subsequently brought back up).  WTF?  Why is my Ethernet connection going down?
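If you want to check how often this is happening on your own machine, a few lines of Python can tally the link drops.  This is only a rough sketch: count_link_drops is a made-up helper, it assumes your driver logs the same "NIC Link is Down" wording quoted above, and the "Link is Up" sample line is invented for illustration.

```python
import re
from collections import Counter

# Captures the interface name that precedes the message text.
LINK_DOWN = re.compile(r"(\S+) NIC Link is Down")


def count_link_drops(lines):
    """Count 'NIC Link is Down' messages per network interface."""
    drops = Counter()
    for line in lines:
        match = LINK_DOWN.search(line)
        if match:
            drops[match.group(1)] += 1
    return drops


# Sample lines in the abridged form shown above (the "Up" line is hypothetical).
sample = [
    "kernel : enp37s0 NIC Link is Down",
    "kernel : enp37s0 NIC Link is Up",
    "kernel : enp37s0 NIC Link is Down",
]
print(count_link_drops(sample))  # Counter({'enp37s0': 2})
```

To run it against the real log, pass an open file instead of the sample list, e.g. count_link_drops(open("/var/log/syslog", errors="replace")).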

I tried the usual things (e.g. checking and reconfiguring NetworkManager, netplan and networkd, switching the Ethernet cable to a different port on the router, trying a different cable), all to no avail.

Following the process of elimination, I eventually ended up logged into the router — a FritzBox 7490:

[Image: FritzBox 7490]

Under Home Network » Network » Network Settings » LAN Settings I had the following:

[Image: FritzBox LAN settings]

Yep, exactly as I wanted it.  Since my ADSL Internet connection is the bandwidth bottleneck, I prefer to leave LAN ports at 100Mbps most of the time to reduce power consumption and generate less heat.  It’s only when I need to sling around huge amounts of data on the LAN that I switch the ports to 1Gbps.

Since I’d exhausted pretty-much all other options, I switched the ports from “Green Mode” to “Power Mode”, then opened up the 20 tabs in my News folder…  Nothing.  No connection failures.  Hmmm…

I restarted the browser and tested again.  No connection failures.
I rebooted the computer and tested again.  No connection failures.
I surfed for days.  No connection failures.

Woo-hoo!

The “Green Mode” for LAN ports on a FritzBox 7490 doesn’t just lower the connection speed to 100Mbps — it puts the Ethernet port into a low-power standby mode that effectively drops the connection at the other (computer) end within (tens of) seconds of the link going idle.  TLS handshakes weren’t failing because there was something wrong with the websites I was trying to connect to — they failed because the link between my computer and the router was down!

Now, I have had consumer electronics annoy the crap out of me in the past with their over-zealous “energy saving features”, and AVM (the manufacturer of the FritzBox) is based out of Germany, so I suspect that the root cause of this might be some sort of poorly-conceived EU Directive mandating extremely low power consumption modes.  Regardless, the workaround ended up being trivial, rock-solid networking has been restored, and there are no more connection failures when web surfing — so I’m back to being a happy camper.  🙂

TL;DR:  If you are experiencing web connection errors, and your computer/laptop is connected via Ethernet to your router, check the settings on your router for any “energy saving features” that might possibly be causing your ports to go into “standby mode”.  If you are connecting via Wi-Fi then a “low power” transmitter setting might cause similar sorts of issues, and cranking up the power may help solve them.

I hope that ends up being useful to someone.  Take it easy.


Suppressing messages in /var/log/syslog

Running Ubuntu or some other Linux distro?  Seeing lots of pointless messages in /var/log/syslog?  Want to be able to stop/suppress/filter the entries from a particular program so that they don’t obscure the more important ones?  I’ll show you how.

Lots of Linux users peek inside of the logs that accumulate in /var/log/ every so often to keep an eye on what’s going on under the hood.  Most of the time things trundle along quite nicely, but occasionally some piece of software will behave badly and fill up your logs with all sorts of pointless garbage.

That happened to me today with Steam.  I had a look inside of /var/log/syslog and saw over 1,500 entries like this (generated today alone):

Jan  1 10:18:48 SystemName steam.desktop[1390]: followed by...

[INFO:crash_reporting.cc(216)] Crash reporting enabled for process: browser
[WARNING:crash_reporting.cc(255)] Failed to set crash key: UserID with value: 0
[WARNING:crash_reporting.cc(255)] Failed to set crash key: BuildID with value: 1543263366
[WARNING:crash_reporting.cc(255)] Failed to set crash key: SteamUniverse with value: Public
[WARNING:crash_reporting.cc(255)] Failed to set crash key: Vendor with value: Valve
[ERROR:gpu_process_transport_factory.cc(1026)] Lost UI shared context.
[INFO:crash_reporting.cc(239)] Crash reporting enabled for process: renderer
[INFO:crash_reporting.cc(216)] Crash reporting enabled for process: gpu-process
Startup - updater built Nov 26 2018 20:15:21
Verification complete
Verifying installation...
Background update loop checking for update. . .
Checking for available updates...
Downloading manifest: client-download.steampowered.com/client/steam_client_ubuntu12
Download skipped: /client/steam_client_ubuntu12 version 1543346820, installed version 1543346820
Nothing to do
Shutdown
Checking for update on startup
Performing checksum verification of executable files
CAPIJobRequestUserStats - Server response failed 2
CAppInfoCacheReadFromDiskThread took 43 milliseconds to initialize
CApplicationManagerPopulateThread took 6 milliseconds to initialize (will have waited on CAppInfoCacheReadFromDiskThread)
ExecCommandLine: "'/home/tim/.local/share/Steam/ubuntu12_32/steam'"
ExecuteSteamURL: "steam://openurl/https://steamcommunity.com/my/gamecards/991980/"
Exiting workitem thread
Failed to init SteamVR because it isn't installed
Fontconfig error: "/etc/fonts/conf.d/10-scale-bitmap-fonts.conf", line 72: non-double matrix element
Fontconfig warning: "/etc/fonts/conf.d/10-scale-bitmap-fonts.conf", line 80: saw unknown, expected number
Gtk-Message: Failed to load module "atk-bridge"
Gtk-Message: Failed to load module "gail"
Installing breakpad exception handler for appid(steam)/version(1543346820)
JS method call Broadcast.RegisterForBroadcastStatus with 1 arguments
JS method call Broadcast.RegisterForViewerRequests with 1 arguments
JS method call FriendSettings.GetEnabledFeatures with 1 arguments
JS method call FriendSettings.RegisterForSettingsChanges with 1 arguments
JS method call Parental.RegisterForParentalSettingsChanges with 1 arguments
JS method call SharedConnection.AllocateSharedConnection with 1 arguments
JS method call SharedConnection.RegisterOnLogonInfoChanged with 2 arguments
JS method call SharedConnection.RegisterOnMessageReceived with 2 arguments
JS method call SharedConnection.SendMsgAndAwaitResponse with 3 arguments
JS method call SharedConnection.SendMsg with 2 arguments
JS method call SharedConnection.SubscribeToClientServiceMethod with 2 arguments
JS method call SharedConnection.SubscribeToEMsg with 2 arguments
JS method call Storage.GetString with 2 arguments
JS method call WebChat.BSuppressPopupsInRestore with 1 arguments
JS method call WebChat.GetCurrentUserAccountID with 1 arguments
JS method call WebChat.GetOverlayChatBrowserInfo with 1 arguments
JS method call WebChat.GetPushToTalkEnabled with 1 arguments
JS method call WebChat.GetSignIntoFriendsOnStart with 1 arguments
JS method call WebChat.GetWebChatLanguage with 1 arguments
JS method call WebChat.GetWebChatURL with 1 arguments
JS method call WebChat.RegisterForComputerActiveStateChange with 1 arguments
JS method call WebChat.RegisterForFriendPostMessage with 1 arguments
JS method call WebChat.RegisterForPushToTalkStateChange with 1 arguments
JS method call WebChat.RegisterOverlayChatBrowserInfoChanged with 1 arguments
JS method call WebChat.SetVoiceChatActive with 1 arguments
message repeated 25 times: [ JS method call SharedConnection.SubscribeToClientServiceMethod with 2 arguments]
message repeated 3 times: [ JS method call SharedConnection.SendMsgAndAwaitResponse with 3 arguments]
message repeated 4 times: [ JS method call Storage.GetString with 2 arguments]
message repeated 7 times: [ JS method call SharedConnection.SubscribeToEMsg with 2 arguments]
message repeated 8 times: [ Installing breakpad exception handler for appid(steam)/version(1543346820)]
migrating temporary roaming config store
Opted-in Controller Mask for AppId 0: 0
Pins up-to-date!
roaming config store loaded successfully - 2357 bytes.
Running Steam on ubuntu 18.04 64-bit
(steam:1678): Gtk-WARNING **: gtk_disable_setlocale() must be called before gtk_init()
** (steam:1678): WARNING **: Could not create object for /org/freedesktop/NetworkManager/Devices/1: unknown object type
** (steam:1678): WARNING **: Ignoring invalid property 'address-data'
** (steam:1678): WARNING **: Ignoring invalid property 'addr-gen-mode'
** (steam:1678): WARNING **: Ignoring invalid property 'autoconnect-priority'
** (steam:1678): WARNING **: Ignoring invalid property 'interface-name'
** (steam:1678): WARNING **: Ignoring invalid property 'route-data'
** (steam:1678): WARNING **: Ignoring invalid property 'wake-on-lan'
** (steam:1678): WARNING **: Unknown device type 14
** (steam:1678): WARNING **: Unknown setting 'proxy'
STEAM_RUNTIME is enabled automatically
System startup time: 4.88 seconds

(Duplicates removed to save space.)

I mean, seriously???  Thousands of lines of ‘chatty’ crap that should never have been logged in the first place, or which I can’t fix or do anything about anyway.  /sigh

Now, Steam isn’t the only program behaving badly, but it was the one that caught my eye, and it was the worst offender, so I decided to deal with it first.

I’m using Ubuntu 18.04, which uses rsyslog to do the logging.  Nice program, but the documentation is definitely not friendly for newbies.  It’s also aimed more at sysadmins who want to manage their logs in a different way.

If you just want to suppress the messages being generated by a specific program, then this is what you need to do:

First, use your favourite editor to edit rsyslog’s configuration file:

$ sudo nano /etc/rsyslog.d/50-default.conf

The top of the file will start with a bunch of commented lines.  Comments all start with a #.  My first uncommented line contained auth,authpriv.* /var/log/auth.log.

Add the following “expression-based filter” somewhere above/before your first uncommented line:

if ($programname == 'steam.desktop') then stop

It shouldn’t take a rocket scientist to work out what’s going on.

  • $programname is a predefined message property that holds the name of the program.
  • steam.desktop is the name of the program as it appears in /var/log/syslog entries like this:
    • Jan  1 10:18:48 SystemName steam.desktop[1390]: Running Steam on ubuntu 18.04 64-bit
  • stop directs rsyslog to take no further action with this message (i.e. it silently drops it and will not log it to a file).

Change the name of the program to be whatever you want to suppress messages from, and then save and exit the editor.  (CTRL-X followed by Y followed by ENTER will do that for users new to nano.)
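Not sure which program is the noisiest, or exactly how its name appears in the log?  Here is a small Python sketch that counts entries per program.  It assumes the traditional timestamp format shown in the example entry above; count_by_program is a hypothetical helper, and apart from the Steam line the sample entries are made up.

```python
import re
from collections import Counter

# Matches e.g. "Jan  1 10:18:48 SystemName steam.desktop[1390]: message"
# and captures the program name ("steam.desktop"), without the [pid] part.
SYSLOG_LINE = re.compile(
    r"^\w{3}\s+\d+\s+\d\d:\d\d:\d\d\s+\S+\s+([^\s\[:]+)(?:\[\d+\])?:"
)


def count_by_program(lines):
    """Count log entries per program name."""
    counts = Counter()
    for line in lines:
        match = SYSLOG_LINE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


sample = [
    "Jan  1 10:18:48 SystemName steam.desktop[1390]: Running Steam on ubuntu 18.04 64-bit",
    "Jan  1 10:18:49 SystemName steam.desktop[1390]: Verification complete",
    "Jan  1 10:20:01 SystemName CRON[2001]: (root) CMD (example)",  # hypothetical entry
]
for program, count in count_by_program(sample).most_common():
    print(program, count)
```

Pass open("/var/log/syslog", errors="replace") instead of the sample list to rank the programs in your own log; the names it prints are the ones to compare against $programname.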

For the filter to take effect:

$ sudo service rsyslog restart

That’s it, you’re done!  Your logs will never be sullied by messages from that application again.  Enjoy the higher signal-to-noise ratio!

But wait…

Q:  What if I only want to get rid of the ‘less important’ messages, and still receive all the ‘more serious’ ones?

A legitimate question.  The answer relies on the developer of the program assigning an appropriate ‘severity’ to their messages.  That’s something you don’t have control over, so the following may not work, but you can try it anyway:

if ($programname == 'steam.desktop' and $syslogseverity > 5) then stop

By also testing the $syslogseverity property of the message you might be able to limit the messages you suppress to just the chatty/pointless ones.

The different severity levels, and what they correspond to, are as follows:

0 — emergencies — System unusable
1 — alerts — Immediate action required
2 — critical — Critical condition
3 — errors — Error conditions
4 — warnings — Warning conditions
5 — notifications — Normal but significant conditions
6 — informational — Informational messages
7 — debugging — Debugging messages

So — theoretically — testing for $syslogseverity > 5 should silently drop all informational and debugging messages, but let notifications, warnings and so-on through.  Theoretically.
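The test can be mimicked in a few lines of Python to see which levels would survive.  This only mirrors the logic of the expression above; should_drop is a hypothetical helper name, and rsyslog itself does the real work.

```python
# Severity numbers and names, as listed in the table above.
SEVERITY_NAMES = {
    0: "emergencies",
    1: "alerts",
    2: "critical",
    3: "errors",
    4: "warnings",
    5: "notifications",
    6: "informational",
    7: "debugging",
}


def should_drop(programname, severity, target="steam.desktop", threshold=5):
    """Mirror of: if ($programname == target and $syslogseverity > threshold) then stop"""
    return programname == target and severity > threshold


for level, name in SEVERITY_NAMES.items():
    action = "dropped" if should_drop("steam.desktop", level) else "kept"
    print(level, name, action)
```

Only levels 6 and 7 come out as dropped, and messages from any other program are kept regardless of level.  Raising or lowering the threshold moves the cut-off accordingly.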

Q:  What if I want to suppress messages based on something other than the $programname?

Yep, you can do that.  A list of the various properties you can easily access is found here:

https://www.rsyslog.com/doc/v8-stable/configuration/properties.html

Expression-based filters are great in that they give you the freedom to be as arbitrarily complex as you like, and they should be familiar to anyone with any programming experience at all.  But remember what I said about the documentation not being newbie-friendly?  Yeah.  Brace yourself and dive into RainerScript.

Anyway, I’m outta here.  Happy New Year!


The ‘best’ way to brace a wooden door or gate

Got a sagging wooden door or gate?  Building a new one?  Not sure how you should brace it?  Confused about the conflicting advice you may have read elsewhere?  Read on.

Wooden doors and gates come in many different shapes, sizes and styles.  A very simple one would look something like this:

[Image: new door]

Unfortunately, if you build it that way, it will soon end up looking more like this:

[Image: sagging door]

The weight of the door/gate has transferred to a relatively small number of nails/screws that hold the wooden boards to the horizontal rails, crushed the adjacent wood fibres, and now the door/gate is sagging.  Your door/gate may already look like that.

To repair or prevent sagging requires that the door/gate be braced.  A brace is a diagonal piece of wood that takes a bit of the load off the nails/screws.

Tension Braces

Many doors/gates have tension braces.  They look like this:

[Image: door with tension brace]

A tension brace works by transferring load from the outside end of the bottom rail to the inside (hinge) end of the top rail.  It ‘pulls’ the weight of the gate up to the top hinge.

Compression Braces

The other — and now far more common — type of brace is the compression brace.  They look like this:

[Image: door with compression brace]

A compression brace works by transferring the load from the outside end of the top rail to the inside (hinge) end of the bottom rail.  The weight of the gate ‘rests’ more on the bottom hinge.  Load is transferred something like this:

[Image: load paths in a compression-braced door]

So, which is better?  How should you brace your door/gate?

Tension vs Compression

Let’s kick off by just clarifying one thing:  Any type of bracing is better than no bracing at all.  Pretty-much all doors/gates require some form of bracing, but depending on size, shape and weight, you can sometimes get away with very little bracing and the type of bracing doesn’t really matter.  Narrow and/or light doors/gates fall into the ‘it probably doesn’t matter’ category.

If your door is wide and/or heavy, however, then you are better off with a compression brace.  A compression brace is easy for most DIYers to construct to a satisfactory standard, and is hard to screw up.  Just remember:

  • Compression braces are suitable for doors where the angle between the bottom rail and the brace is greater than 45°
  • Make sure both ends of the brace have full contact with the rails
  • Put (ideally two) nails/screws through each board into the brace
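The 45° rule of thumb is easy to check with a bit of trigonometry before you cut anything.  The sketch below assumes the brace runs the full width of the gate, from the hinge end of the bottom rail to the outside end of the top rail; the function names and the example dimensions are made up for illustration.

```python
import math


def brace_angle_deg(run_mm, rise_mm):
    """Angle between the bottom rail and the brace, in degrees.

    run_mm:  horizontal distance covered by the brace (roughly the gate width)
    rise_mm: vertical distance between the bottom and top rails
    """
    return math.degrees(math.atan2(rise_mm, run_mm))


def compression_brace_ok(run_mm, rise_mm):
    """Apply the rule of thumb: the brace should be steeper than 45 degrees."""
    return brace_angle_deg(run_mm, rise_mm) > 45.0


# Hypothetical tall, narrow door: 900 mm wide, rails 1500 mm apart.
print(compression_brace_ok(900, 1500))   # True (about 59 degrees)

# Hypothetical wide, low gate: 2400 mm wide, rails 1200 mm apart.
print(compression_brace_ok(2400, 1200))  # False (about 27 degrees)
```

Put simply, the brace is steeper than 45° whenever the distance between the rails is greater than the width the brace has to span.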

Note:  It is entirely possible to tension brace a wide and heavy gate — if you have a decent amount of experience with different types of joints and you know what you are doing.  The average person, however, doesn’t have that experience, and doesn’t have the required carpentry skills…

The biggest advantage that compression braces have over tension braces is the way that loads are transferred.  In compression, loads are distributed over the end cross-section of the brace as well as the screws/nails, which greatly reduces the overall rate of fibre compression (so the gate will sag less and last longer).  Because the loads are transferred to the bottom hinge, and the bottom hinge is closer to the ground, the gate ends up more stable: it bounces up and down less.  Gate posts supporting tension-braced gates also tend to bend/twist over time, because the load is transferred to the top hinge (higher from the ground, where it exerts more torque).

Why The Contradictions?

If compression braces have such a clear advantage, why do some sources still recommend tension braces?

Mainly due to historical reasons, but also because in a very limited number of scenarios tension is still better.

In the good old days, farmers and homesteaders would cut their own timber/lumber with axes and saws.  The wood would be ‘green’.  Green lumber (i.e. freshly cut or anything that hasn’t been kiln-dried or seasoned for a couple of years) contains a large amount of water.  As it dries it shrinks.  The shrinkage occurs in all dimensions: most of it is across the grain, but the lumber also gets slightly shorter.  If you are using green lumber, a tension brace is ideal because the shrinking wood will help pull the bottom rail up.

Since nearly all lumber used nowadays is purchased from a store, and has been seasoned/kiln-dried, it contains very little water. If you live in a climate with a wet season, the lumber will absorb water and expand. It will get longer. If you build using dry lumber, a compression brace is ideal because the expanding wood will push the top rail up.

In the good old days, steel wasn’t used for gate posts.  Timber posts were sunk into bare soil.  Since most gates were outside and exposed to the elements, rain would hit the gate from both sides.  A compression brace would channel water that hits the ‘back’ of the gate down towards the gate post — where it would saturate the soil, encourage rot, and result in the premature failure of the post.  A tension brace would channel water away from the gate post — helping to keep the soil there dry and prolonging the life of the post.

Modern gate builders often use steel gate posts sunk into concrete, and don’t expect their gates to last a generation, so don’t care where the water goes when it runs off the gate.  It’s not even a consideration.  If the brace is out of the weather (e.g. the inside of a barn, shed or house door) then it’s also not an issue.

Summary

When would a tension brace be acceptable, or even advisable?

  • Narrow doors/gates
  • Light doors/gates
  • When using green lumber, which includes:
    • lumber you have sawn/milled yourself
    • (store-bought or reclaimed) lumber that has been exposed to rain
  • In exposed situations where the gate post is subject to rot
  • If you intend to always keep the door/gate painted/oiled (the brace won’t lengthen during the wet season because water won’t get absorbed in any appreciable amount)
  • If you want — or are prepared — to use a turnbuckle brace, then tension is the best/only way to go

In all other situations, the average DIYer would be better off installing a compression brace.  If you’re not sure, install a compression brace.

Anyway, I hope you found some of that interesting or useful.  Cheerio!


Passively-cooled CPU Thermals (Part 2)

In the first Passively-cooled CPU Thermals post I looked at how the completely fanless Streacom DB4 I built performed under various CPU loads.  A question over on Silent PC Review asked how the thermal curve looked when the CPU was unloaded.  Good question — let’s find out!

It seems logical that a passively-cooled computer that uses heat pipes and large slabs of aluminium will heat up and cool down differently than an air-cooled system with a (relatively) small fin stack.

When I performed the previous round of tests, I ended up with graphs like this:

Thermals 60m at 100%

Now, one thing that niggled away at the back of my mind was how the temperature seemed to spike at the start and then flatten off relatively quickly.  Given that I was sampling the temperature sensors at 5s intervals, I wasn’t sure if the sudden change in angle was just due to a lack of resolution, or because of something else.  Hmmm…

Anyway, the SPCR question gave me an excuse to run another test but instead of putting the CPU under load and heating the system up, this time I’d be unloading the CPU and cooling the system down.  I wondered if/how that would be different.

Test 5 — CPU unloaded from 100% at an ambient temperature of 20°C

The Ryzen 5 1600 was first placed under a 100% load (all 12 threads pegged at 100%) for about an hour until an equilibrium temperature was reached (60°C).  The load was then removed and the CPU temperature recorded:

Thermals 60m cooldown

My previous set of results used wider graphs, but when websites scale them the text ends up a bit small and blurry, so this time I made the graph narrower.  Otherwise the testing and recording setup was identical.

The CPU cooled down from 60°C to 49°C almost instantly, and then gradually made its way down to 34°C over the course of an hour.  It would have cooled down a couple of degrees more, but I wasn’t prepared to wait: that sudden change in angle was even more pronounced and needed investigating.

Obviously 1 hour graphs sampled every 5s were too coarse to shed a whole lot of light on what was happening in those first few seconds, so I had to run some more granular tests.  I figured 1 minute sampled every 1s would do the trick.
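For anyone wanting to repeat this sort of fixed-interval sampling without a GUI tool, here is a rough Python sketch.  It is not how the graphs in this post were produced (those came from Psensor).  The sysfs path is an assumption: on many Linux systems that file reports a temperature in millidegrees Celsius, but which zone (if any) corresponds to the CPU varies from machine to machine.  The function names are hypothetical too.

```python
import time


def read_cpu_temp_c(path="/sys/class/thermal/thermal_zone0/temp"):
    """Read a temperature in degrees C from a sysfs file holding millidegrees.

    The default path is an assumption and may not be the CPU on your system.
    """
    with open(path) as f:
        return int(f.read().strip()) / 1000.0


def sample(read_temp, interval_s=1.0, duration_s=60.0, sleep=time.sleep):
    """Return a list of (nominal_elapsed_seconds, temperature) pairs."""
    samples = []
    count = int(duration_s / interval_s)
    for i in range(count + 1):
        samples.append((i * interval_s, read_temp()))
        if i < count:
            sleep(interval_s)  # wait before taking the next reading
    return samples


# Demo with a fake sensor and no real waiting, so it runs instantly anywhere.
fake_readings = iter([60.0, 59.0, 58.0, 57.0])
demo = sample(lambda: next(fake_readings), interval_s=1, duration_s=3,
              sleep=lambda seconds: None)
print(demo)
```

For a real run you would call sample(read_cpu_temp_c, interval_s=1, duration_s=60).  Note that the elapsed times are nominal (i × interval) rather than measured, which is fine for a one-minute test but will drift over longer runs.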

Test 6 — CPU loaded to 100% for 60s at an ambient temperature of 20°C

First 60s heating up

Test 7 — CPU unloaded from 100% for 60s at an ambient temperature of 20°C

First 60s cooling down

Nope, it wasn’t a figment of my imagination.  The sudden change in angle is definitely there — both when heating up and cooling down.

Let’s put the results ‘side-by-side’:

First 60s up and down

What we’re seeing are two different response curves.  When the CPU is initially (un)loaded the temperature changes at ±1°C every 1s for about 8–9s.  It then flattens out dramatically and changes by ±1°C every 16–24s after that.  Huge difference.
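Putting numbers on that difference is simple arithmetic, sketched here in Python.  The figures are the ones read off the graphs above; the variable names are just for illustration.

```python
# Seconds needed for a 1 degree C change in each phase (read off the graphs).
fast_s_per_degree = 1.0
slow_s_per_degree = (16.0, 24.0)   # low and high estimates
fast_phase_duration_s = (8.0, 9.0)

# How much slower the second phase is than the first.
slowdown = tuple(s / fast_s_per_degree for s in slow_s_per_degree)
print(slowdown)  # (16.0, 24.0)

# Temperature change covered during the fast phase, in degrees C.
fast_phase_change_c = tuple(d / fast_s_per_degree for d in fast_phase_duration_s)
print(fast_phase_change_c)  # (8.0, 9.0)
```

So the first 8–9°C of change happens in under ten seconds, and everything after that proceeds 16–24 times more slowly.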

Changes in thermal conductivity can usually be explained by changes in the material or media being used.  In this case we have a CPU, soldered to a copper IHS, plated in nickel, covered by a thin layer of TIM, covered by a copper shim, covered by TIM, clamped against an aluminium block with exposed copper heat pipes.

This image (from Streacom’s DB4 Manual) might make the stacking a bit clearer:

CPU Cooling Stack

If we ignore the thin layers of TIM, it’s pretty-much metal all the way through to the heat pipes before we reach the first medium that could possibly have a 16–24x lower thermal conductivity — the water inside the heat pipes themselves.

I suspect that the thermal mass between the heat pipes and the CPU can buffer about 8–9s worth of heat output, and that — due to high thermal conductivity — this mass heats up and cools down rather quickly.  The water in the heat pipes, however, needs to undergo a phase change (from liquid to vapour), travel to the end of the heat pipe, undergo another phase change (from vapour to liquid), and then flow back — a process that could easily be an order of magnitude slower at transferring heat.

Interesting stuff.

At the end of the day it doesn’t matter exactly how thermally conductive your cooling system is — as long as it’s conductive enough to get rid of all of the heat your CPU is producing without impacting performance (i.e. thermally throttling), it’s fine.

The previous tests (and now weeks of daily hammering) show that the DB4+LH6 combination is easily capable of cooling a stock Ryzen 5 1600 — no matter how heavily you load it or how long your jobs run for.  You can run a Ryzen 5 1600 at 100% load all day, every day, and it won’t even break a sweat.

AMmmD and Streacommm — I’m lovin’ it!  😉


Passively-cooled CPU Thermals

The Completely Silent Computer I built relies entirely on passive cooling.  Some folks are keen to know how well (or poorly) it works in the real world.  Can it adequately cool a loaded system, or does thermal throttling make it pointless?  Well, it’s time to find out.

[Image: DB4]

As can be seen in these images, the Ryzen 5 1600 that I installed in the DB4 is cooled by heat pipes that transfer heat from the block on the CPU to spreaders attached to the aluminium walls of the case.  Since I installed the optional LH6 Cooling Kit, there are a total of six heat pipes that connect to three spreaders on two walls of the case.

DB4o

The walls are 13mm-thick extruded aluminium plates.  They weigh a lot.  Heat conducts from the spreaders inside to the exterior surface.  Grooves that run the full height of the exterior walls provide a huge surface area that allows heat to be transferred to air that then flows up and out the top of the plates.

DB4p

Vents have been machined into the inside of each plate — one towards the bottom and one towards the top — which allow cool air to be drawn into the case, and hot air to flow out.

DB4q

That’s about it.  Hot air rises due to buoyancy, flows out of the case, creating a negative pressure zone inside the case, which then sucks cool air in through the bottom and sides of the case.  Rinse.  Repeat.  Doesn’t get much simpler.  Totally passive.

Enough ‘theory’ — let’s put this thing to the test!

I searched around a bit and ended up installing a nice little program called Psensor because it would let me produce some clear graphs of how system load affects CPU and GPU temperatures over time.

Note:  This post only deals with CPU thermals.  GPU thermals will come later.

In a passively-cooled system, instantaneous temperatures aren’t actually that useful.  The walls of the case will initially soak up a lot of heat and then radiate/convect some of that heat back into the case, which will heat up the internal components.  What that means is that it takes a lot longer to reach an equilibrium temperature in a passively-cooled system than it does in an air-cooled system (or even a water-cooled system).

To establish a baseline, I monitored CPU temperatures from the moment the computer was turned on, and let it idle for a couple of hours.

Test 0 — CPU idling at an ambient temperature of 20⁰C

Thermals 120m at IDLE

At idle, the 3.2GHz (stock base clock) Ryzen 5 1600 (with 6 cores and 12 threads) — housed in the completely passively-cooled DB4 — reached a temperature of 31⁰C.  That’s 11⁰C above ambient.

If I’d left the test running for a couple more hours the CPU temperature probably would have gotten 1⁰C warmer, but my house was heating up at the same time and the increase in ambient temperature would have impacted on the results — so I didn’t bother extending the test.  Let’s accept 12⁰C above ambient.

Having established a baseline, I then decided to give the CPU a variety of fixed workloads and let them run for as long as it took for the CPU temperature to stabilise.  That ended up being about an hour.
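The post doesn't say exactly how "stabilised" was judged, so treat the following as one possible rule rather than the method used here: a minimal Python sketch that calls a run stable once the most recent readings all sit within a small band. The function name, the `window` and `tolerance` parameters, and the sample readings are all hypothetical.

```python
def has_stabilised(samples, window=10, tolerance=1.0):
    """Return True once the last `window` readings span no more than `tolerance` degrees C."""
    if len(samples) < window:
        return False  # not enough history to judge yet
    recent = samples[-window:]
    return max(recent) - min(recent) <= tolerance


# Made-up readings for illustration only (not data from these tests).
still_rising = [31.0, 38.0, 44.0, 49.0, 53.0]
settled = [57.0, 59.0, 59.5, 60.0, 60.0, 60.0]

print(has_stabilised(still_rising, window=3))
print(has_stabilised(settled, window=3))
```

In a passively-cooled system the window would need to be generous (many minutes of samples), because the case walls keep soaking up heat long after the CPU temperature first appears to level off.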

Test 1 — CPU loaded to 25% at an ambient temperature of 21⁰C

Thermals 60m at 25%

At 25% load, the 3.2GHz Ryzen 5 1600  — housed in the passively-cooled DB4 — reached an equilibrium temperature of 47⁰C.  That’s 26⁰C above ambient.  No thermal throttling occurred.

The hottest part of the hottest exterior wall was 37⁰C (as measured with an IR thermometer).  Very warm to the touch, but nothing at all to be worried about — cats would love it.

Test 2 — CPU loaded to 50% at an ambient temperature of 20⁰C

Thermals 60m at 50%

At 50% load, the Ryzen 5 1600 — housed in the passively-cooled DB4 — reached an equilibrium temperature of 52⁰C.  That’s 32⁰C above ambient.  No thermal throttling occurred.

The hottest part of the hottest exterior wall was 40⁰C.  Toasty — borderline hot even — but I was able to press my hand firmly against it for as long as I liked without feeling any discomfort.

Test 3 — CPU loaded to 75% at an ambient temperature of 22⁰C

Thermals 60m at 75%

At 75% load, the Ryzen 5 1600 — housed in the DB4 — reached an equilibrium temperature of 58⁰C.  That’s 36⁰C above ambient.  No thermal throttling occurred.

The hottest part of the hottest exterior wall was 42⁰C.  Hot, and if I pressed my hand firmly against it for more than about 10s it became uncomfortable.

Test 4 — CPU loaded to 100% at an ambient temperature of 22⁰C

Thermals 60m at 100%

At 100% load, the Ryzen 5 1600 in the DB4 reached an equilibrium temperature of 60⁰C.  That’s 38⁰C above ambient.  No thermal throttling occurred.

The hottest part of the hottest exterior wall remained 42⁰C.  Hot to touch, and uncomfortable after a while, but not painful.

So, the first set of real-world results is in!

If we normalise the results for an ambient temperature of 20⁰C then we have the following equilibrium temperatures for a stock 3.2GHz Ryzen 5 1600 in a Streacom DB4 with the optional LH6 Cooling Kit installed:

  • Idle (0%):  32⁰C
  • 25% load:  46⁰C
  • 50% load:  52⁰C
  • 75% load:  56⁰C
  • 100% load:  58⁰C
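For anyone who wants to check the arithmetic, the normalisation is just "keep the rise above ambient, then add it to 20⁰C". Here is a minimal Python sketch of that, using the figures from Tests 0 to 4. The names are made up for illustration, and the idle entry uses the accepted 12⁰C rise rather than the 31⁰C reading.

```python
# Equilibrium results from Tests 0-4 as (ambient, CPU temperature), in degrees C.
# The idle entry is 32 (the accepted 12 C rise at 20 C ambient), not the 31 C reading.
MEASURED = {
    "Idle (0%)": (20, 32),
    "25% load": (21, 47),
    "50% load": (20, 52),
    "75% load": (22, 58),
    "100% load": (22, 60),
}

REFERENCE_AMBIENT = 20  # degrees C


def normalise(ambient, cpu_temp, reference=REFERENCE_AMBIENT):
    """Shift a reading to the reference ambient, keeping the rise above ambient unchanged."""
    return reference + (cpu_temp - ambient)


NORMALISED = {label: normalise(a, t) for label, (a, t) in MEASURED.items()}

for label, temp in NORMALISED.items():
    print(f"{label}: {temp} C")
```

This treats the rise above ambient as independent of the ambient temperature itself, which is a simplification but a reasonable one over a 2⁰C spread.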

For the folks that can’t get enough charts:

Thermal Results 1

Analysis

So, what do these results tell us?

  1. The most obvious thing is that either the Ryzen 5 1600 is a very efficient CPU, the DB4+LH6 is a very effective cooling combination, or both!  To max out at 58⁰C under full load is excellent for a completely passively-cooled system, and much better than what I was expecting.
  2. There will clearly be no problem running compute-heavy overnight jobs on this system.  A 100% load for 4–6 hours at a time should be a walk in the park.
  3. Ryzen CPUs thermally throttle north of 90⁰C and shut down at 95⁰C so there is ample room for overclocking the Ryzen 5 1600 to 1600X levels, upgrading to the new Ryzen 5 2600 (and moderately overclocking that), or even stepping up to something like a Ryzen 7 2700 (although I doubt you’d get more than a light overclock out of that).
  4. If a Ryzen 5 1600 or 2600 gives you as much performance as you need (and the 1600 does for me), then I’m confident you could get away with just the stock cooling solution that ships with the DB4 — no need to get the optional LH6 Cooling Kit like I did.  Four heat pipes and a single spreader would provide adequate cooling, but your CPU temps would be noticeably higher (probably in the high 60s or low 70s).
  5. Exterior wall temperatures can get hot, and even uncomfortable after extended contact, but never painful and certainly not dangerous.  No need to worry about the safety of pets or small children (if you have them) — at least not with a stock Ryzen 5 1600 (or 2600).  If you overclock or put in a Ryzen 7 2700 then you’d want to monitor the exterior temperatures yourself to see if they ever climb to hazardous levels.
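To put a rough number on the thermal headroom, here is a small Python sketch. It assumes the 38⁰C rise at 100% load (Test 4) stays the same whatever the ambient temperature, and it uses 90⁰C as a conservative throttle point. Both are simplifications, and the function names are hypothetical.

```python
THROTTLE_TEMP = 90   # degrees C, the throttle point quoted in the analysis
FULL_LOAD_RISE = 38  # degrees C above ambient at 100% load (Test 4)


def full_load_temp(ambient):
    """Estimated 100%-load equilibrium temperature, assuming a constant rise above ambient."""
    return ambient + FULL_LOAD_RISE


def headroom(ambient):
    """Degrees C left before the throttle point at 100% load."""
    return THROTTLE_TEMP - full_load_temp(ambient)


for ambient in (20, 30, 40):
    print(f"ambient {ambient} C: CPU {full_load_temp(ambient)} C, headroom {headroom(ambient)} C")
```

Under those assumptions there is about 32⁰C of headroom in a 20⁰C room, and still some margin on a 40⁰C day. Real behaviour in a hot room may differ, so this is a back-of-envelope estimate, not a measurement.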

Changes

With the benefit of this real-world testing and hindsight, would I do anything differently if I was building a passively-cooled DB4 system today?  Yep, sure would.

  • I’d get a Ryzen 5 2600 — a little bit more performance for about the same heat.
  • I wouldn’t bother buying the optional LH6 Cooling Kit — it would be overkill for even a moderately-overclocked 2600.
  • Not having to fit the longer LH6 heat pipes would mean that pretty-much all of the component-clearance issues would disappear, and I’d have many more motherboards to choose from.  I’d pick one with two NVMe M.2 slots to cut down on cable clutter.

Apart from that, everything else would stay the same.

Future plans

  • Conduct some load tests on the GPU in isolation.
  • Torture-test the DB4 by doing things like:
    • Loading both the CPU and GPU to 100% at the same time.
      • Not very realistic, but a ‘worst-case’ scenario that deserves to be tested — For Science!
    • Blocking vents while the system is under load.

If you have any questions about the results or methodology above, or you’d like a certain type of thermal test performed, just ask in the comments.

Auf Wiedersehen!

Posted in Silent Computer | 9 Comments

Does Pinnacle Ridge change anything?

With AMD’s 2nd Generation Ryzen processors now released and tested, I figured it was worth reviewing the Completely Silent Computer article and seeing if I would have done anything differently if I were building the system today.  The short answer is ‘yes’.

In 2017 AMD released the 1st Generation of Ryzen CPUs with the codename “Summit Ridge”, based on the 14nm Zen microarchitecture:

DB4f

  • Ryzen 3 1200, 1300X
  • Ryzen 5 1400, 1500X, 1600, 1600X
  • Ryzen 7 1700, 1700X, 1800X

I chose a Ryzen 5 1600 for my system because it has adequate performance for my needs, good performance per Watt, and runs quite cool (65W TDP).

In 2018-02 AMD released the 1st Generation of Ryzen APUs with the codename “Raven Ridge”, also based on the 14nm Zen microarchitecture:

  • Ryzen 3 2200G
  • Ryzen 5 2400G

Just recently, in 2018-04, AMD released the 2nd Generation of Ryzen CPUs with the codename “Pinnacle Ridge”, based on the 12nm Zen+ microarchitecture:

  • Ryzen 5 2600, 2600X
  • Ryzen 7 2700, 2700X

amd-ryzen-2000-1

The Ryzen 5 2600 is the natural successor to the 1600 and delivers a minor (~8%) performance boost whilst still staying within the 65W TDP.

Like the 1600, the 2600 has its Integrated Heat Spreader (IHS) soldered onto the die, so the thermal conductivity will not degrade over time as it does with Thermal Interface Material (TIM, which is basically thermal paste) — a good thing if you plan on keeping your CPU for a long time.

The 2600 offers a little bit of upside, with no downside, so yeah — if I were building the system today a 2600 would definitely have gone in.

That’s about the only change I would have made.

 

PS:  Selecting components and upgrades is always a process of elimination.  For those that are curious, here’s some of the thinking that goes on ‘behind the scenes’…

Both Summit Ridge and Pinnacle Ridge Ryzen 7s are overkill for my needs:

  • This computer simply doesn’t need eight cores — I don’t generate enough work to keep them all busy.
  • Idle cores still consume power and produce heat — both of which are undesirable in an energy-efficient and passively-cooled system with a relatively small (240W) PSU.
  • A (hypothetical) R7 2700X + GTX 1050 Ti GPU combination could draw as much as (105+75=) 180W of power from a 12V rail that can only supply 168W (14A) max… so would not even be electrically stable.
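The 12V rail arithmetic in that last point can be sketched in a few lines of Python. As in the text, TDP is used as a stand-in for power draw, which is only an approximation. The 65W figure for the R5 1600 is its TDP from earlier in this post, and the function names are made up for illustration.

```python
RAIL_VOLTAGE = 12      # volts
RAIL_MAX_CURRENT = 14  # amps, the 12V rail limit quoted above


def rail_capacity_watts(voltage=RAIL_VOLTAGE, current=RAIL_MAX_CURRENT):
    """Maximum power the rail can supply (P = V * I)."""
    return voltage * current


def rail_margin_watts(*loads_watts):
    """Positive means spare capacity; negative means the rail is oversubscribed."""
    return rail_capacity_watts() - sum(loads_watts)


GTX_1050_TI = 75  # watts
R7_2700X = 105    # watts (TDP)
R5_1600 = 65      # watts (TDP)

print("R7 2700X + GTX 1050 Ti margin:", rail_margin_watts(R7_2700X, GTX_1050_TI), "W")
print("R5 1600 + GTX 1050 Ti margin:", rail_margin_watts(R5_1600, GTX_1050_TI), "W")
```

On these numbers the 2700X pairing is 12W over the rail limit, while a 65W part with the same GPU leaves 28W spare.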

The Raven Ridge APUs are interesting but had/have issues:

  • They didn’t exist when I purchased my system.
  • Linux support for APU graphics was likely to be patchy in the first few months — something I just didn’t want to have to deal with.
  • They use TIM between the IHS and the die — not solder.
    • They won’t cool as easily to start with.
    • Cooling will actually get worse over time as the TIM degrades — resulting in higher temps, thermal throttling, and possibly even a shortened lifespan.
  • Their GPU performance is on-par with a GT 1030.  I was rocking a GTX 1060 prior to the Intel Meltdown/Spectre debacle and — based on that reference point — predicted that even R5 2400G graphics performance (Vega 11) would be inadequate for my needs.
  • With 4 Cores and 4 Threads the R3 2200G simply isn’t powerful enough, but with 4 Cores and 8 Threads the R5 2400G is borderline acceptable (SMT matters for the computation I do).

I’m confident that eventually AMD’s APUs will offer the performance I need for this system.  I like the idea of having an APU in the DB4 — I really do — but it needs to be performant.  Currently they aren’t capable of replacing my DGPU.  It will be interesting to see how things develop on this front.

Posted in Silent Computer | 8 Comments