Cleanup Azure Diagnostic data from Table Storage

Azure Diagnostics for Virtual Machines is a great feature. It gives you visibility into syslog, CPU usage, memory usage, boot diagnostics, and more. But if you have used it for a while, you may find yourself in the uncomfortable position of having collected so much data that storage costs have climbed to undesirable heights. At least at this point in time, Microsoft does not offer a retention mechanism for diagnostic data. One quick and dirty way to get rid of it is to completely wipe the diagnostics data and start from scratch. Out of the box, there is no way to throw away just the old data: it is all or nothing.

Searching the web, I did not find anything that solved my problem out of the box, so I decided to write my own script for the job and, of course, to share it with you.

First, where is this information stored? When you set up Diagnostics for an Azure VM, you are asked to provide a Storage Account. If you do not want to use an existing one, the wizard will help you create a new one. The collected data is stored in the Tables section of this Storage Account. Different tables are created depending on the diagnostic information you have chosen to monitor and the operating system of the VM. You may see some tables named something like WADMetricXXX and others like LinuxCpuVer2v0, LinuxMemoryVer2v0, LinuxDiskVer2v0, etc.

At this point, I should mention Azure Storage Explorer. It is a tool you can either download and install on your computer or use directly from the Azure Portal’s Storage blade (currently in Preview). Expand the Storage Account, then Tables, and you should see the available tables. There are some tables you should not edit, like the ones starting with a dollar sign ($) or the SchemasTable.

The WADMetricXXX tables have date-specific names. This means that if such a table holds only really old entries, you can safely delete the whole table. Just make sure you leave at least one table with the latest data. Also keep in mind that there may be two series of them, like WADMetricPT1HPxxx and WADMetricPT1MPxxx (the first prefix ends with 1HP and the second with 1MP). One series holds metrics captured per hour and the other per minute, so expect the one that ends with 1MP to have many more entries.
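
Picking out the old date-stamped tables by hand gets tedious, so here is a minimal sketch that filters them by name. It assumes the table names end in an eight-digit yyyyMMdd date (for example WADMetricsPT1HP10DV2S20170101); check the names in your own account first, because that layout is an assumption. Select-OldWadMetricTable and its parameters are hypothetical names used for illustration, not part of any Azure module, and the function only filters names. It deletes nothing.

```powershell
function Select-OldWadMetricTable {
    Param (
        [string[]]$TableName,   # table names, e.g. from Get-AzureStorageTable
        [datetime]$Cutoff       # tables stamped before this date are returned
    )

    foreach ($name in $TableName) {
        # Only consider WADMetric* tables whose name ends in 8 digits (assumed yyyyMMdd)
        if ($name -match '^WADMetric.*?(\d{8})$') {
            $stamp = [datetime]::MinValue
            $parsed = [datetime]::TryParseExact(
                $Matches[1], 'yyyyMMdd',
                [System.Globalization.CultureInfo]::InvariantCulture,
                [System.Globalization.DateTimeStyles]::None,
                [ref]$stamp)
            # Skip names whose suffix is not a valid date
            if ($parsed -and $stamp -lt $Cutoff) { $name }
        }
    }
}

# Sample names only; replace with the names from your own Storage Account
$names = 'WADMetricsPT1HP10DV2S20170101', 'WADMetricsPT1MP10DV2S20991231', 'LinuxCpuVer2v0'
Select-OldWadMetricTable -TableName $names -Cutoff ((Get-Date).AddDays(-90))
```

With the sample names, only the table stamped 20170101 should come back. After reviewing the list, each name could be passed to Remove-AzureStorageTable with the storage context, but do that step deliberately and keep at least one recent table.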

However, deleting these tables will not completely solve your capacity problem. The rest of the tables (LinuxCpuVer2v0, LinuxMemoryVer2v0, LinuxDiskVer2v0, etc.) still hold a lot of old entries that consume significant storage space. To delete the entries older than N months, I have created a PowerShell script (as written, it uses N=2 and counts each month as 30 days). If you know PowerShell, it should be quite clear what the script does. As always, there are no warranties and you should use it at your own risk.

Import-Module AzureRM.Profile
Import-Module AzureRM.Storage
Import-Module AzureRM.Resources
Import-Module Azure.Storage
Import-Module AzureRmStorageTable

function myClear-AzCustomTableEntries {
    Param ([string]$resourceGroup, [string]$storageAccount, [string]$tableName)

    # Variables initialization
    [int64]$oldestAllowedMonth = 2
    $transactions = 0

    # Building our query
    $query = New-Object Microsoft.WindowsAzure.Storage.Table.TableQuery
    $query.TakeCount = 1000

    # Creating our table object
    $saContext = (Get-AzureRmStorageAccount -ResourceGroupName $resourceGroup -Name $storageAccount).Context
    $table = Get-AzureStorageTable -Name $tableName -Context $saContext

    # The PartitionKey is expected to start with a partition number from 0 to 9
    for ($i = 0; $i -le 9; $i++) {
        # Upper bound: partition $i at the cutoff date, expressed in ticks
        [string]$fromPKTick = ("000000000000000000" + $i + "___0" + ( (Get-Date).AddDays(-30 * $oldestAllowedMonth).Ticks) )
        # Lower bound: the largest key of the previous partition, or the smallest possible key for partition 0
        if ( $i -ne 0 ) {
            [string]$toPKTick = ("000000000000000000" + ($i - 1) + "___0999999999999999999")
        } else {
            [string]$toPKTick = "0000000000000000000___0000000000000000000"
        }
        $filter1 = [Microsoft.WindowsAzure.Storage.Table.TableQuery]::GenerateFilterCondition("PartitionKey",[Microsoft.WindowsAzure.Storage.Table.QueryComparisons]::LessThanOrEqual,$fromPKTick)
        $filter2 = [Microsoft.WindowsAzure.Storage.Table.TableQuery]::GenerateFilterCondition("PartitionKey",[Microsoft.WindowsAzure.Storage.Table.QueryComparisons]::GreaterThanOrEqual,$toPKTick)
        $query.FilterString = [Microsoft.WindowsAzure.Storage.Table.TableQuery]::CombineFilters($filter1, [Microsoft.WindowsAzure.Storage.Table.TableOperators]::And, $filter2)

        # Start parsing for given PartitionKey
        while ($true) {
            # Get the next batch of up to 1000 entities from our table
            $results = $table.CloudTable.ExecuteQuery($query)

            # If we got an empty response, we should break
            if ( (-not $results) -or ($results[0].PartitionKey.Length -eq 0) ) { break }

            # Delete each row in result (the filter already limits it to old entries)
            Foreach ($entity in $results) {
                # Every 250 transactions print a message
                if ( ($transactions % 250 -eq 0) -And ($transactions -ne 0) ) {
                    Write ("Entity PartitionKey: " + $entity.PartitionKey + ". Timestamp of last deleted entity: " + $entity.Timestamp.ToString() + ". Number of transactions so far: " + $transactions )
                }
                # If we got an empty response, we should break, else we proceed with deletion of entity
                if ($entity.PartitionKey.Length -eq 1) { break } else { Remove-AzureStorageTableRow -table $table -entity $entity | out-null }
                # Update number of transactions
                $transactions++
            }
        }
    }

    # Report total number of transactions before exiting
    Write "Number of transactions in total: $transactions"
}

# Login to Azure
Login-AzureRmAccount

# Select our subscription
Select-AzureRmSubscription -Subscription MySubscriptionId

# Set our variables
$myResourceGroup = "mySuperRG"
$myStorageAccount = "mysuperduperdiagnosticaccnt"

# If you don't know the tables on the Storage account, either use Azure Storage Explorer tool or run the two commands below
$mySaContext = (Get-AzureRmStorageAccount -ResourceGroupName $myResourceGroup -Name $myStorageAccount).Context
Get-AzureStorageTable -Context $mySaContext | select Name | ft

# Now call our function for each table that you need to cleanup
myClear-AzCustomTableEntries -resourceGroup $myResourceGroup -storageAccount $myStorageAccount -tableName "LinuxCpuVer2v0"
myClear-AzCustomTableEntries -resourceGroup $myResourceGroup -storageAccount $myStorageAccount -tableName "LinuxDiskVer2v0"
myClear-AzCustomTableEntries -resourceGroup $myResourceGroup -storageAccount $myStorageAccount -tableName "LinuxMemoryVer2v0"

Although I hope it will get the job done for you without any modifications, you may need to play a little with some values to make the script fit your needs 100%. If you find this script useful, take the time to leave a comment to let me know how it worked for you.
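
If you do change the values, it helps to see which PartitionKey range the script would target before anything is deleted. The sketch below rebuilds the same lower and upper keys that the function above computes for one partition number, without touching Azure. Get-PartitionKeyRange and its parameters are hypothetical names made up for illustration. The key layout (a 19-digit partition number, three underscores, and a 19-digit zero-padded tick count) is simply what the script assumes, so compare the output with a few real keys in Storage Explorer.

```powershell
function Get-PartitionKeyRange {
    Param (
        [ValidateRange(0, 9)][int]$Index,   # partition number, same as $i in the script
        [datetime]$Cutoff                   # entries older than this would be targeted
    )

    # Upper bound: partition $Index, tick count of the cutoff date padded to 19 digits
    $upper = ('{0:D19}___{1:D19}' -f $Index, $Cutoff.Ticks)

    # Lower bound: the largest key of the previous partition,
    # or the smallest possible key when we are at partition 0
    if ($Index -ne 0) {
        $lower = ('{0:D19}___0{1}' -f ($Index - 1), ('9' * 18))
    } else {
        $lower = ('{0:D19}___{0:D19}' -f 0)
    }

    [pscustomobject]@{ Lower = $lower; Upper = $upper }
}

# Print the ranges for a 60-day cutoff (2 months of 30 days, as in the script)
$cutoff = (Get-Date).AddDays(-60)
0..9 | ForEach-Object { Get-PartitionKeyRange -Index $_ -Cutoff $cutoff }
```

Entities whose PartitionKey falls between Lower and Upper (inclusive) are the ones the cleanup query would return for that partition number.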




