Rundeck is great for scheduling and monitoring all those annoying little tasks that keep everything running cleanly and smoothly. However, I find that it quickly generates a lot of small log files that begin to take up valuable disk space.
One way to deal with this is to write a script that clears out log files that are over a specific age. The drawback of doing this is that the executions remain in the Rundeck database and the link to the relevant log file becomes obsolete.
I’ve written a Python script that uses the Rundeck API to query all executions and delete any that are too old. This leaves the Rundeck database clean and also removes the log files for the deleted executions.
Download
The script, along with a sample configuration file, is available on GitHub. You can download or fork it from this page.
Configuration
The sample configuration file, properties.json, is:

    {
        "RUNDECKSERVER": "rundeck server name or IP address",
        "PORT": 4440,
        "API_KEY": "API Key with correct privileges",
        "PAGE_SIZE": 1000,
        "MAXIMUM_DAYS": 90,
        "TIMEOUT": 60,
        "DELETE_TIMEOUT": 1200,
        "VERBOSE": false
    }

Key | Value |
---|---|
RUNDECKSERVER | Must be replaced by the name or IP address of your Rundeck server. |
PORT | The port your Rundeck server listens on. 4440 is the Rundeck default. |
API_KEY | Must be replaced by a valid API key with admin access to your server. Such an API key can be generated from the Admin->Profile menu item of the Rundeck GUI. |
MAXIMUM_DAYS | The maximum age, in days, of the executions to keep. Executions older than this are deleted. |
PAGE_SIZE | The number of executions that are retrieved at one time. If this number is too low the cleanup will take a long time to complete; if it is too high your Rundeck server might crash due to lack of memory. See below. |
TIMEOUT | The number of seconds to wait for a single API call. 60s should be OK. |
DELETE_TIMEOUT | Execution deletes are processed in bulk and can take a long time to complete. This API call has a separate timeout that should be longer than the standard one. 1200s works well for me. |
VERBOSE | Set this to true to get a more detailed log from the script. Useful for debugging, but probably not for everyday use. |
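
To make the settings above concrete, here is a minimal sketch of how such a file could be read and checked. It is not taken from the actual script: the `load_settings` function, the `DEFAULTS` and `REQUIRED` names, and the idea of falling back to the sample values are all assumptions made for illustration.

```python
import json

# Hypothetical defaults, copied from the sample properties.json above.
DEFAULTS = {
    "PORT": 4440,
    "PAGE_SIZE": 1000,
    "MAXIMUM_DAYS": 90,
    "TIMEOUT": 60,
    "DELETE_TIMEOUT": 1200,
    "VERBOSE": False,
}

# These two have no sensible default and must be supplied.
REQUIRED = ("RUNDECKSERVER", "API_KEY")


def load_settings(text):
    """Parse the JSON text of a settings file and fill in any defaults."""
    settings = dict(DEFAULTS)
    settings.update(json.loads(text))
    missing = [key for key in REQUIRED if not settings.get(key)]
    if missing:
        raise ValueError("missing required settings: " + ", ".join(missing))
    # The bulk delete is expected to take longer than an ordinary call.
    if settings["DELETE_TIMEOUT"] < settings["TIMEOUT"]:
        raise ValueError("DELETE_TIMEOUT should not be shorter than TIMEOUT")
    return settings
```

Calling `load_settings(open("properties.json").read())` would return a plain dictionary, so a typo in a key name shows up as a default value rather than a crash. Whether that is desirable is a design choice.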
Method
The script lists every project on the Rundeck server and then scans each project in turn for jobs. The executions for each job are then queried. When I first wrote this script I attempted to get all the executions in one API call. I found that on my relatively lowly Rundeck servers this tended to run the Java VM out of memory, causing the entire Rundeck installation to crash (!). The script now queries the executions using the API's paging functionality; the PAGE_SIZE setting gives the number of executions that are retrieved in a single API call. I’ve found that 1000 works well for my servers, getting the job done relatively quickly without crashing the VM.
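
The paging idea can be sketched roughly as follows. This is an illustration rather than the script's own code: `iter_executions` and `fetch_page` are hypothetical names, and `fetch_page(offset, page_size)` stands in for whatever function actually calls the Rundeck API with an offset and a maximum result count. Here it is replaced by a fake that serves slices of an in-memory list, so the sketch runs without a server.

```python
def iter_executions(fetch_page, page_size):
    """Yield executions one at a time, fetching them page by page.

    fetch_page(offset, page_size) is a hypothetical callable that returns
    a list of at most page_size executions starting at the given offset.
    """
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        for execution in page:
            yield execution
        # A short (or empty) page means there is nothing left to fetch.
        if len(page) < page_size:
            break
        offset += page_size


# Stand-in for the real API call: serves slices of an in-memory list.
fake_executions = list(range(2500))
requested_offsets = []


def fake_fetch(offset, page_size):
    requested_offsets.append(offset)
    return fake_executions[offset:offset + page_size]


retrieved = list(iter_executions(fake_fetch, 1000))
```

With 2500 fake executions and a page size of 1000, the fake is called three times (offsets 0, 1000 and 2000), and no call ever has to hold more than 1000 results at once, which is the point of paging.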
As each execution is retrieved, its run date is checked against the MAXIMUM_DAYS setting. All executions that are older than this setting are deleted in a single API call. This call can take a while, hence the separate DELETE_TIMEOUT setting, but so far that has not caused me any problems.
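
The age check itself is simple date arithmetic. The sketch below assumes each execution has already been turned into a dictionary with an `id` and a `started` datetime. That shape, and the `select_expired` name, are assumptions for illustration and not the format the Rundeck API returns.

```python
from datetime import datetime, timedelta


def select_expired(executions, maximum_days, now=None):
    """Return the ids of executions started more than maximum_days ago."""
    if now is None:
        now = datetime.now()
    cutoff = now - timedelta(days=maximum_days)
    return [e["id"] for e in executions if e["started"] < cutoff]


# Example with a fixed "now" so the result does not depend on today's date.
now = datetime(2015, 6, 1)
executions = [
    {"id": 101, "started": datetime(2015, 1, 15)},  # well past the cutoff
    {"id": 102, "started": datetime(2015, 3, 3)},   # exactly at the cutoff, so kept
    {"id": 103, "started": datetime(2015, 5, 20)},  # recent
]
expired_ids = select_expired(executions, 90, now=now)  # [101]
```

The list of ids collected this way is what would be handed to the single bulk delete call, using the longer DELETE_TIMEOUT.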
The script outputs a basic log of what it is doing so that you can track its progress.
Execution
python RundeckLogfileCleanup.py
This starts the script, which looks for a properties.json file in the current directory. Alternatively,
python RundeckLogfileCleanup.py server1_props.json
This starts the script with the specified settings file, which is useful if you have a number of Rundeck servers that you need to clean up.
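
The optional argument could be handled along these lines. This is a sketch of the behaviour described above, not the script's actual code; `settings_path` and `DEFAULT_SETTINGS_FILE` are hypothetical names.

```python
import sys

DEFAULT_SETTINGS_FILE = "properties.json"


def settings_path(argv):
    """Return the settings file named on the command line, or the default."""
    if len(argv) > 1:
        return argv[1]
    return DEFAULT_SETTINGS_FILE


if __name__ == "__main__":
    print(settings_path(sys.argv))
```

Keeping one settings file per server means the same script can be scheduled once for each server, each run pointing at its own file.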
Versions
The script has been written and tested on Rundeck v2.4.0-1 (americano indigo briefcase). I know that there are issues running it on v2.2, because the return value from the jobs API query changed slightly between versions.
I’m using Python 2.7 and you’ll need the excellent requests module installed.