
2 common issues for the latest Google Cloud Logging and Cloud Monitoring Agents, and how to fix them

The Cloud Logging and Cloud Monitoring Agents (previously known as Stackdriver) are lifesavers when it comes to supporting and operating servers in Google Cloud Platform (GCP). However, deploying them can sometimes be tricky. This article describes how you can quickly fix the 2 issues that, in our experience, come up most often.

Quick note: if you haven’t been using these agents in GCP, you really should! Google Cloud’s documentation has installation guides for deploying both agents into your VMs.

Once installed, the agents send near-real-time data to your Cloud Monitoring and Cloud Logging consoles, giving you a live feed of your instances running in GCP.

Cloud Monitoring Agent swap file errors

When deploying the Cloud Monitoring Agent on Linux VM instances, you’ll sometimes see this error repeated in the agent’s output: write_gcm: can not take infinite value. It occurs when the VM has no swap file configured, because the Cloud Monitoring Agent is set up to collect swap metrics by default. With no swap present, the agent cannot find a valid value to report, hence the repeated errors.

To resolve this issue with the Cloud Monitoring Agent, locate the following lines in the configuration file /etc/stackdriver/collectd.conf and remove them (we commonly use vim for this, but any text editor will work):

LoadPlugin swap
<Plugin "swap">
ValuesPercentage true
</Plugin>

After removing the lines above, restart the Cloud Monitoring Agent with the following command: sudo service stackdriver-agent restart
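If you need to apply this change on more than one VM, the manual edit above can be scripted. The following is only a minimal sketch in Python, assuming the swap section appears in collectd.conf in the same shape as the lines shown above. The function name strip_swap_plugin is our own for illustration and is not part of the agent.

```python
import re


def strip_swap_plugin(conf_text: str) -> str:
    """Return conf_text without the swap LoadPlugin line and <Plugin "swap"> block.

    Assumption: the block is laid out as in the article, with the opening
    and closing tags each on their own line.
    """
    kept = []
    in_swap_block = False
    for line in conf_text.splitlines(keepends=True):
        stripped = line.strip()
        if in_swap_block:
            # Drop everything up to and including the closing tag.
            if stripped == "</Plugin>":
                in_swap_block = False
            continue
        if re.fullmatch(r'LoadPlugin\s+"?swap"?', stripped):
            continue  # drop the line that loads the swap plugin
        if re.fullmatch(r'<Plugin\s+"?swap"?\s*>', stripped):
            in_swap_block = True  # start dropping the plugin's settings
            continue
        kept.append(line)
    return "".join(kept)
```

To use it, read /etc/stackdriver/collectd.conf into a string, pass it through the function, keep a backup of the original file, and write the result back before restarting the agent.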

Credits to https://myshittycode.com/2020/06/13/gcp-stackdriver-agent-write_gcm-can-not-take-infinite-value-error/ for the solution which has no doubt helped many people (us included!).

Cloud Logging Agent consumes large amounts of CPU and does not log data to Cloud Logging

When deploying the Cloud Logging Agent on Windows Server instances in GCP, you’ll sometimes see that the Ruby interpreter consistently uses a large amount of CPU and that no logs arrive in Cloud Logging’s winevt.raw category. This issue normally occurs when you use older versions of Windows Server images.

To diagnose this issue, inspect the Cloud Logging Agent’s log file located at: C:\Program Files (x86)\Stackdriver\LoggingAgent\fluentd.log. Look for a pattern where the agent’s worker is terminated and then restarts itself over and over, as in the following lines:

[info]: #0 Initialized the insert ID key to xxxxxxxxxxxxx

[info]: #0 fluentd worker is now running worker=0

[info]: Worker 0 finished unexpectedly with signal SIGSEGV

[info]: #0 Initialized the insert ID key to xxxxxxxxxxxxx

[info]: #0 fluentd worker is now running worker=0

[info]: Worker 0 finished unexpectedly with signal SIGSEGV
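If the log file is large, a short script can confirm the pattern faster than scrolling. The following is a minimal sketch in Python that counts the SIGSEGV lines shown above. The function names and the threshold of 2 are our own assumptions for illustration, so adjust them to taste.

```python
# Text taken from the log lines shown in this article.
CRASH_MARKER = "finished unexpectedly with signal SIGSEGV"


def count_worker_crashes(log_text: str) -> int:
    """Count the log lines that report a worker killed by SIGSEGV."""
    return sum(1 for line in log_text.splitlines() if CRASH_MARKER in line)


def looks_like_crash_loop(log_text: str, threshold: int = 2) -> bool:
    """Return True when the worker crashed at least `threshold` times.

    The threshold is a hypothetical cut-off, not a value defined by the agent.
    """
    return count_worker_crashes(log_text) >= threshold
```

Read fluentd.log into a string and pass it to looks_like_crash_loop. A True result suggests the restart loop described in this section.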

The repeated SIGSEGV terminations indicate an issue with the gRPC plugin used by the Ruby interpreter, which is described in further detail here: https://github.com/grpc/grpc/issues/7876. To resolve this issue, open the C:\Program Files (x86)\Stackdriver\LoggingAgent\fluent.conf file in a text editor and locate the following line:

use_grpc true

Change it to:

use_grpc false
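The same edit can be scripted if you manage several Windows instances. The following is a minimal sketch in Python, assuming the setting sits on its own line as shown above. The function name disable_grpc is our own for illustration and is not part of the agent.

```python
import re
from typing import Tuple

# Matches a line consisting only of "use_grpc true" (with optional
# indentation), leaving LF or CRLF line endings untouched.
_USE_GRPC_TRUE = re.compile(
    r"^([ \t]*use_grpc[ \t]+)true[ \t]*(?=\r?$)", re.MULTILINE
)


def disable_grpc(conf_text: str) -> Tuple[str, int]:
    """Switch 'use_grpc true' to 'use_grpc false'.

    Returns the updated text and the number of lines changed, so the
    caller can tell whether the setting was found at all.
    """
    return _USE_GRPC_TRUE.subn(r"\g<1>false", conf_text)
```

Read fluent.conf into a string, call the function, and write the result back only if the returned count is greater than zero. Keep a backup of the original file first.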

After saving the change, restart the StackdriverLogging service. You should see the Ruby interpreter start up and gradually settle down to low CPU consumption. Remember to check Cloud Logging to confirm that the winevt.raw category was automatically created by the Cloud Logging Agent (this may take several minutes after startup, as the agent polls all the existing logs).

Note: According to a ticket raised by us with Google Support, this will be fixed in a future release of the agent. So stay tuned!

Further troubleshooting for the Cloud Logging and Cloud Monitoring Agents

If the steps above did not solve your issues, your next stop should be Google Cloud’s official troubleshooting documentation for the Cloud Logging and Cloud Monitoring Agents.

If the Google Cloud documentation does not resolve your issues, reach out to us at https://www.matrixc.com/contact-us/ for assistance. We would be happy to help!

Interested in reading more Google Cloud tips? Check out our blog section at matrixc.com for more articles like this one!


Affiq Fadzil

