Anomaly Detection: File Create, Update & Delete Deltas¶

In this notebook we explore the time that elapses between a file's Create/Update event and its accompanying Delete event.

Before running this notebook, run the notebook file_create_delete.ipynb to generate the 'file_times.csv' file.
In [12]:
from datetime import datetime
import pandas as pd
from tqdm.auto import tqdm
from sklearn.cluster import KMeans
from IPython.display import display

First we import the file_times.csv file to get the dataframe with all the Deletion, Creation & Update timestamps per TargetFilename.
We filter out the rows where UpdateTime is 0, because these rows consist of only Deletion events

In [15]:
df = pd.read_csv('../file-create-delete/file_times.csv', index_col=0)
usable = df[df['UpdateTime'] != '0'].reset_index(drop=True)
usable
Out[15]:
TargetFilename DeletionTime CreateTime UpdateTime
0 C:\Windows\ServiceProfiles\NetworkService\AppD... 2022-12-09 09:51:40.809 2022-11-09 03:01:12.045 2022-12-23 15:27:40.678
1 C:\ProgramData\regid.1991-06.com.microsoft\reg... 2022-12-09 09:51:40.984 2022-11-09 10:53:53.605 2022-12-23 17:42:06.559
2 C:\ProgramData\Microsoft\Diagnosis\DownloadedS... 2022-12-09 09:51:41.265 2022-12-01 14:34:16.181 2022-12-23 15:27:39.590
3 C:\ProgramData\Microsoft\Diagnosis\parse.dat 2022-12-09 09:51:41.307 2022-12-01 14:34:16.775 2022-12-23 15:27:39.900
4 C:\Windows\System32\LogFiles\WMI\Diagtrack-Lis... 2022-12-09 09:51:41.704 2022-12-09 09:51:29.902 2022-12-09 09:51:29.926
... ... ... ... ...
34167 C:\Users\User\AppData\Roaming\Microsoft\Window... 2022-12-23 18:39:03.490 2022-12-23 18:36:09.885 2022-12-23 18:39:39.656
34168 C:\Windows\SERVIC~1\LOCALS~1\AppData\Local\Tem... 2022-12-23 18:39:18.876 2022-12-23 18:39:17.709 2022-12-23 18:39:17.709
34169 C:\Windows\SERVIC~1\LOCALS~1\AppData\Local\Tem... 2022-12-23 18:39:18.981 2022-12-23 18:39:17.709 2022-12-23 18:39:17.709
34170 C:\Windows\SERVIC~1\LOCALS~1\AppData\Local\Tem... 2022-12-23 18:39:18.988 2022-12-23 18:39:17.709 2022-12-23 18:39:17.709
34171 C:\Users\User\AppData\Roaming\Microsoft\Window... 2022-12-23 18:39:39.656 2022-12-23 18:36:09.885 2022-12-23 18:39:39.656

34172 rows × 4 columns

To use this dataset we need to add two columns: CreateDelta and UpdateDelta.
These columns hold the results of DeletionTime - CreateTime and DeletionTime - UpdateTime respectively, in seconds. A negative delta means that the Create or Update timestamp is later than the DeletionTime.

In [16]:
datetime_format = "%Y-%m-%d %H:%M:%S.%f"

# Initialise the delta columns as floats, because they will hold fractional seconds
usable['CreateDelta'] = 0.0
usable['UpdateDelta'] = 0.0

for index, row in tqdm(usable.iterrows(), total=usable.shape[0]):
    # Parse the deletion timestamp once per row and reuse it for both deltas
    deletion_time = datetime.strptime(row['DeletionTime'], datetime_format)
    usable.loc[index, 'CreateDelta'] = (deletion_time - datetime.strptime(row['CreateTime'], datetime_format)).total_seconds()
    usable.loc[index, 'UpdateDelta'] = (deletion_time - datetime.strptime(row['UpdateTime'], datetime_format)).total_seconds()

usable
Out[16]:
TargetFilename DeletionTime CreateTime UpdateTime CreateDelta UpdateDelta
0 C:\Windows\ServiceProfiles\NetworkService\AppD... 2022-12-09 09:51:40.809 2022-11-09 03:01:12.045 2022-12-23 15:27:40.678 2616628.764 -1229759.869
1 C:\ProgramData\regid.1991-06.com.microsoft\reg... 2022-12-09 09:51:40.984 2022-11-09 10:53:53.605 2022-12-23 17:42:06.559 2588267.379 -1237825.575
2 C:\ProgramData\Microsoft\Diagnosis\DownloadedS... 2022-12-09 09:51:41.265 2022-12-01 14:34:16.181 2022-12-23 15:27:39.590 674245.084 -1229758.325
3 C:\ProgramData\Microsoft\Diagnosis\parse.dat 2022-12-09 09:51:41.307 2022-12-01 14:34:16.775 2022-12-23 15:27:39.900 674244.532 -1229758.593
4 C:\Windows\System32\LogFiles\WMI\Diagtrack-Lis... 2022-12-09 09:51:41.704 2022-12-09 09:51:29.902 2022-12-09 09:51:29.926 11.802 11.778
... ... ... ... ... ... ...
34167 C:\Users\User\AppData\Roaming\Microsoft\Window... 2022-12-23 18:39:03.490 2022-12-23 18:36:09.885 2022-12-23 18:39:39.656 173.605 -36.166
34168 C:\Windows\SERVIC~1\LOCALS~1\AppData\Local\Tem... 2022-12-23 18:39:18.876 2022-12-23 18:39:17.709 2022-12-23 18:39:17.709 1.167 1.167
34169 C:\Windows\SERVIC~1\LOCALS~1\AppData\Local\Tem... 2022-12-23 18:39:18.981 2022-12-23 18:39:17.709 2022-12-23 18:39:17.709 1.272 1.272
34170 C:\Windows\SERVIC~1\LOCALS~1\AppData\Local\Tem... 2022-12-23 18:39:18.988 2022-12-23 18:39:17.709 2022-12-23 18:39:17.709 1.279 1.279
34171 C:\Users\User\AppData\Roaming\Microsoft\Window... 2022-12-23 18:39:39.656 2022-12-23 18:36:09.885 2022-12-23 18:39:39.656 209.771 0.000

34172 rows × 6 columns
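
The row-by-row loop above is easy to follow, but it parses every timestamp individually. As a minimal sketch, assuming all three timestamp columns follow the format string used above, the same deltas can be computed with vectorized pandas operations. The `sample` dataframe is a hypothetical two-row stand-in for `usable`, with values copied from the output above.

```python
import pandas as pd

# Hypothetical stand-in for `usable`: two rows copied from the output above
sample = pd.DataFrame({
    "DeletionTime": ["2022-12-09 09:51:41.704", "2022-12-23 18:39:39.656"],
    "CreateTime":   ["2022-12-09 09:51:29.902", "2022-12-23 18:36:09.885"],
    "UpdateTime":   ["2022-12-09 09:51:29.926", "2022-12-23 18:39:39.656"],
})

fmt = "%Y-%m-%d %H:%M:%S.%f"

# Parse each column once; the result is a datetime64 Series
deletion = pd.to_datetime(sample["DeletionTime"], format=fmt)
created = pd.to_datetime(sample["CreateTime"], format=fmt)
updated = pd.to_datetime(sample["UpdateTime"], format=fmt)

# Subtracting datetime Series gives timedeltas; convert them to seconds
sample["CreateDelta"] = (deletion - created).dt.total_seconds()
sample["UpdateDelta"] = (deletion - updated).dt.total_seconds()

print(sample[["CreateDelta", "UpdateDelta"]])
```

For these two rows the deltas should agree with rows 4 and 34171 of the table above.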

We only look at .ps1 and .psm1 files, because the malicious process creates files of these types and they are otherwise rather unusual.

In [40]:
# reset_index() keeps the original row number of `usable` in an 'index' column
only_powershell = usable[usable['TargetFilename'].str.endswith('.ps1') | usable['TargetFilename'].str.endswith('.psm1')].reset_index()
only_powershell
Out[40]:
index TargetFilename DeletionTime CreateTime UpdateTime CreateDelta UpdateDelta
0 261 C:\Windows\Temp\__PSScriptPolicyTest_mj1liz3f.... 2022-12-09 10:09:48.710 2022-12-09 10:09:48.655 2022-12-09 10:09:48.655 0.055 0.055
1 262 C:\Windows\Temp\__PSScriptPolicyTest_jiygjqrb.... 2022-12-09 10:09:48.711 2022-12-09 10:09:48.656 2022-12-09 10:09:48.656 0.055 0.055
2 382 C:\Windows\System32\WindowsPowerShell\v1.0\Mod... 2022-12-09 10:18:41.302 2022-12-01 14:48:02.052 2022-12-09 10:18:41.303 675039.250 -0.001
3 412 C:\Windows\SystemTemp\448EA551-9FF6-4E24-9C07-... 2022-12-09 10:18:51.069 2022-12-09 10:18:32.711 2022-12-09 10:18:32.711 18.358 18.358
4 824 C:\Users\User\AppData\Local\Temp\__PSScriptPol... 2022-12-09 10:41:59.492 2022-12-09 10:41:59.457 2022-12-09 10:41:59.457 0.035 0.035
5 825 C:\Users\User\AppData\Local\Temp\__PSScriptPol... 2022-12-09 10:41:59.493 2022-12-09 10:41:59.457 2022-12-09 10:41:59.457 0.036 0.036
6 834 C:\Windows\Temp\SDIAG_db5d8708-c972-4fdb-bbd5-... 2022-12-09 10:42:03.587 2022-12-09 10:41:58.243 2022-12-09 10:41:58.262 5.344 5.325
7 840 C:\Windows\Temp\SDIAG_db5d8708-c972-4fdb-bbd5-... 2022-12-09 10:42:03.624 2022-12-09 10:41:58.321 2022-12-09 10:41:58.323 5.303 5.301
8 841 C:\Windows\Temp\SDIAG_db5d8708-c972-4fdb-bbd5-... 2022-12-09 10:42:03.624 2022-12-09 10:41:58.323 2022-12-09 10:41:58.323 5.301 5.301
9 842 C:\Windows\Temp\SDIAG_db5d8708-c972-4fdb-bbd5-... 2022-12-09 10:42:03.624 2022-12-09 10:41:58.323 2022-12-09 10:41:58.323 5.301 5.301
10 843 C:\Windows\Temp\SDIAG_db5d8708-c972-4fdb-bbd5-... 2022-12-09 10:42:03.624 2022-12-09 10:41:58.323 2022-12-09 10:41:58.323 5.301 5.301
11 844 C:\Windows\Temp\SDIAG_db5d8708-c972-4fdb-bbd5-... 2022-12-09 10:42:03.642 2022-12-09 10:41:58.355 2022-12-09 10:41:58.355 5.287 5.287
12 845 C:\Windows\Temp\SDIAG_db5d8708-c972-4fdb-bbd5-... 2022-12-09 10:42:03.645 2022-12-09 10:41:58.359 2022-12-09 10:41:58.359 5.286 5.286
13 846 C:\Windows\Temp\SDIAG_db5d8708-c972-4fdb-bbd5-... 2022-12-09 10:42:03.645 2022-12-09 10:41:58.364 2022-12-09 10:41:58.364 5.281 5.281
14 847 C:\Windows\Temp\SDIAG_db5d8708-c972-4fdb-bbd5-... 2022-12-09 10:42:03.645 2022-12-09 10:41:58.364 2022-12-09 10:41:58.364 5.281 5.281
15 3273 C:\Windows\Temp\__PSScriptPolicyTest_oudiyrti.... 2022-12-16 09:58:35.467 2022-12-16 09:58:35.333 2022-12-16 09:58:35.333 0.134 0.134
16 3274 C:\Windows\Temp\__PSScriptPolicyTest_0njfuqwh.... 2022-12-16 09:58:35.469 2022-12-16 09:58:35.336 2022-12-16 09:58:35.336 0.133 0.133
17 3753 C:\Users\User\AppData\Local\Temp\__PSScriptPol... 2022-12-16 10:37:54.150 2022-12-16 10:37:54.065 2022-12-16 10:37:54.065 0.085 0.085
18 3754 C:\Users\User\AppData\Local\Temp\__PSScriptPol... 2022-12-16 10:37:54.157 2022-12-16 10:37:54.069 2022-12-16 10:37:54.069 0.088 0.088
19 3758 C:\Windows\Temp\SDIAG_aa7ab891-aa70-44ac-ba0a-... 2022-12-16 10:37:55.668 2022-12-16 10:37:49.470 2022-12-16 10:37:49.476 6.198 6.192
20 3764 C:\Windows\Temp\SDIAG_aa7ab891-aa70-44ac-ba0a-... 2022-12-16 10:37:55.759 2022-12-16 10:37:49.528 2022-12-16 10:37:49.538 6.231 6.221
21 3765 C:\Windows\Temp\SDIAG_aa7ab891-aa70-44ac-ba0a-... 2022-12-16 10:37:55.764 2022-12-16 10:37:49.546 2022-12-16 10:37:49.546 6.218 6.218
22 3766 C:\Windows\Temp\SDIAG_aa7ab891-aa70-44ac-ba0a-... 2022-12-16 10:37:55.764 2022-12-16 10:37:49.566 2022-12-16 10:37:49.566 6.198 6.198
23 3767 C:\Windows\Temp\SDIAG_aa7ab891-aa70-44ac-ba0a-... 2022-12-16 10:37:55.764 2022-12-16 10:37:49.592 2022-12-16 10:37:49.592 6.172 6.172
24 3768 C:\Windows\Temp\SDIAG_aa7ab891-aa70-44ac-ba0a-... 2022-12-16 10:37:55.764 2022-12-16 10:37:49.614 2022-12-16 10:37:49.614 6.150 6.150
25 3769 C:\Windows\Temp\SDIAG_aa7ab891-aa70-44ac-ba0a-... 2022-12-16 10:37:55.764 2022-12-16 10:37:49.659 2022-12-16 10:37:49.660 6.105 6.104
26 3770 C:\Windows\Temp\SDIAG_aa7ab891-aa70-44ac-ba0a-... 2022-12-16 10:37:55.764 2022-12-16 10:37:49.675 2022-12-16 10:37:49.675 6.089 6.089
27 3771 C:\Windows\Temp\SDIAG_aa7ab891-aa70-44ac-ba0a-... 2022-12-16 10:37:55.764 2022-12-16 10:37:49.675 2022-12-16 10:37:49.675 6.089 6.089
28 10115 C:\$WinREAgent\Scratch\Mount\Windows\WinSxS\am... 2022-12-16 10:46:00.032 2022-12-16 10:44:01.002 2022-12-16 10:44:01.002 119.030 119.030
29 10128 C:\$WinREAgent\Scratch\Mount\Windows\WinSxS\am... 2022-12-16 10:46:00.074 2022-12-16 10:44:01.002 2022-12-16 10:44:01.002 119.072 119.072
30 10388 C:\$WinREAgent\Scratch\Mount\Windows\WinSxS\am... 2022-12-16 10:46:02.437 2022-12-16 10:43:49.524 2022-12-16 10:43:49.524 132.913 132.913
31 10394 C:\$WinREAgent\Scratch\Mount\Windows\WinSxS\am... 2022-12-16 10:46:02.456 2022-12-16 10:43:49.536 2022-12-16 10:43:49.536 132.920 132.920
32 21283 C:\Windows\Temp\__PSScriptPolicyTest_zaksyazf.... 2022-12-19 08:23:25.208 2022-12-19 08:23:25.107 2022-12-19 08:23:25.107 0.101 0.101
33 21284 C:\Windows\Temp\__PSScriptPolicyTest_xn2yn2xz.... 2022-12-19 08:23:25.208 2022-12-19 08:23:25.107 2022-12-19 08:23:25.107 0.101 0.101
34 22827 C:\Users\User\AppData\Local\Temp\__PSScriptPol... 2022-12-19 12:28:41.122 2022-12-19 12:28:41.085 2022-12-19 12:28:41.085 0.037 0.037
35 22828 C:\Users\User\AppData\Local\Temp\__PSScriptPol... 2022-12-19 12:28:41.122 2022-12-19 12:28:41.085 2022-12-19 12:28:41.085 0.037 0.037
36 22830 C:\Windows\Temp\SDIAG_fe2bdbdd-315a-492d-9580-... 2022-12-19 12:28:42.605 2022-12-19 12:28:39.267 2022-12-19 12:28:39.267 3.338 3.338
37 22837 C:\Windows\Temp\SDIAG_fe2bdbdd-315a-492d-9580-... 2022-12-19 12:28:42.620 2022-12-19 12:28:39.339 2022-12-19 12:28:39.339 3.281 3.281
38 22838 C:\Windows\Temp\SDIAG_fe2bdbdd-315a-492d-9580-... 2022-12-19 12:28:42.634 2022-12-19 12:28:39.339 2022-12-19 12:28:39.339 3.295 3.295
39 22839 C:\Windows\Temp\SDIAG_fe2bdbdd-315a-492d-9580-... 2022-12-19 12:28:42.634 2022-12-19 12:28:39.339 2022-12-19 12:28:39.339 3.295 3.295
40 22840 C:\Windows\Temp\SDIAG_fe2bdbdd-315a-492d-9580-... 2022-12-19 12:28:42.634 2022-12-19 12:28:39.339 2022-12-19 12:28:39.339 3.295 3.295
41 22841 C:\Windows\Temp\SDIAG_fe2bdbdd-315a-492d-9580-... 2022-12-19 12:28:42.634 2022-12-19 12:28:39.339 2022-12-19 12:28:39.339 3.295 3.295
42 22842 C:\Windows\Temp\SDIAG_fe2bdbdd-315a-492d-9580-... 2022-12-19 12:28:42.634 2022-12-19 12:28:39.339 2022-12-19 12:28:39.339 3.295 3.295
43 22843 C:\Windows\Temp\SDIAG_fe2bdbdd-315a-492d-9580-... 2022-12-19 12:28:42.642 2022-12-19 12:28:39.397 2022-12-19 12:28:39.399 3.245 3.243
44 22844 C:\Windows\Temp\SDIAG_fe2bdbdd-315a-492d-9580-... 2022-12-19 12:28:42.642 2022-12-19 12:28:39.402 2022-12-19 12:28:39.402 3.240 3.240
45 23886 C:\Windows\Temp\__PSScriptPolicyTest_s5uualr4.... 2022-12-20 08:56:57.505 2022-12-20 08:56:57.470 2022-12-20 08:56:57.470 0.035 0.035
46 23887 C:\Windows\Temp\__PSScriptPolicyTest_bmkk4az5.... 2022-12-20 08:56:57.521 2022-12-20 08:56:57.472 2022-12-20 08:56:57.472 0.049 0.049
47 32061 C:\Windows\Temp\__PSScriptPolicyTest_qbrmgl21.... 2022-12-22 17:10:17.780 2022-12-22 17:10:17.722 2022-12-22 17:10:17.722 0.058 0.058
48 32062 C:\Windows\Temp\__PSScriptPolicyTest_axydttzb.... 2022-12-22 17:10:17.785 2022-12-22 17:10:17.722 2022-12-22 17:10:17.722 0.063 0.063
49 32253 C:\Users\User\AppData\Local\Temp\__PSScriptPol... 2022-12-22 17:12:58.766 2022-12-22 17:12:58.676 2022-12-22 17:12:58.676 0.090 0.090
50 32254 C:\Users\User\AppData\Local\Temp\__PSScriptPol... 2022-12-22 17:12:58.769 2022-12-22 17:12:58.676 2022-12-22 17:12:58.681 0.093 0.088
51 32257 C:\Users\User\AppData\Local\Temp\__PSScriptPol... 2022-12-22 17:13:00.368 2022-12-22 17:13:00.356 2022-12-22 17:13:00.356 0.012 0.012
52 32258 C:\Users\User\AppData\Local\Temp\__PSScriptPol... 2022-12-22 17:13:00.368 2022-12-22 17:13:00.356 2022-12-22 17:13:00.356 0.012 0.012
53 32960 C:\Users\User\AppData\Local\Temp\__PSScriptPol... 2022-12-23 14:18:14.867 2022-12-23 14:18:14.469 2022-12-23 14:18:14.469 0.398 0.398
54 32961 C:\Users\User\AppData\Local\Temp\__PSScriptPol... 2022-12-23 14:18:14.867 2022-12-23 14:18:14.469 2022-12-23 14:18:14.469 0.398 0.398
55 33587 C:\Users\User\AppData\Local\Temp\__PSScriptPol... 2022-12-23 15:28:46.761 2022-12-23 15:28:46.660 2022-12-23 15:28:46.660 0.101 0.101
56 33588 C:\Users\User\AppData\Local\Temp\__PSScriptPol... 2022-12-23 15:28:46.761 2022-12-23 15:28:46.660 2022-12-23 15:28:46.660 0.101 0.101
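
The filter above matches the extension case-sensitively, so a name ending in `.PS1` would be missed. As a hedged sketch of a case-insensitive alternative (not the notebook's method), the extension can be read with `pathlib.PureWindowsPath`, which parses Windows paths on any operating system. The helper `is_powershell_file` and the upper-case example path are hypothetical names introduced for illustration.

```python
from pathlib import PureWindowsPath

# Extensions considered in this notebook
POWERSHELL_SUFFIXES = {".ps1", ".psm1"}


def is_powershell_file(target_filename: str) -> bool:
    """Return True if the Windows path ends in a PowerShell script or module extension."""
    # .suffix returns only the last extension, e.g. '.ps1' for 'name.e2q.ps1'
    return PureWindowsPath(target_filename).suffix.lower() in POWERSHELL_SUFFIXES


examples = [
    r"C:\Users\User\AppData\Local\Temp\__PSScriptPolicyTest_b35xidpj.e2q.ps1",
    r"C:\Users\User\AppData\Local\Temp\__PSScriptPolicyTest_example.abc.PSM1",  # hypothetical upper-case variant
    r"C:\ProgramData\Microsoft\Diagnosis\parse.dat",
]
results = [is_powershell_file(name) for name in examples]
print(results)
```

On a dataframe, such a helper could be applied with `usable['TargetFilename'].map(is_powershell_file)` to build the boolean mask.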
In [37]:
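# Inspect one of the known malware files (listed further below) to see its deltas
display(only_powershell[only_powershell['TargetFilename'].str.contains('__PSScriptPolicyTest_b35xidpj')])

# Flag the PowerShell files whose CreateDelta and UpdateDelta are both below 0.5 seconds
alerts = only_powershell[(only_powershell['CreateDelta'] < 0.5) & (only_powershell['UpdateDelta'] < 0.5)].reset_index(drop=True)

display(alerts)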
index TargetFilename DeletionTime CreateTime UpdateTime CreateDelta UpdateDelta
53 32960 C:\Users\User\AppData\Local\Temp\__PSScriptPol... 2022-12-23 14:18:14.867 2022-12-23 14:18:14.469 2022-12-23 14:18:14.469 0.398 0.398
index TargetFilename DeletionTime CreateTime UpdateTime CreateDelta UpdateDelta
0 261 C:\Windows\Temp\__PSScriptPolicyTest_mj1liz3f.... 2022-12-09 10:09:48.710 2022-12-09 10:09:48.655 2022-12-09 10:09:48.655 0.055 0.055
1 262 C:\Windows\Temp\__PSScriptPolicyTest_jiygjqrb.... 2022-12-09 10:09:48.711 2022-12-09 10:09:48.656 2022-12-09 10:09:48.656 0.055 0.055
2 824 C:\Users\User\AppData\Local\Temp\__PSScriptPol... 2022-12-09 10:41:59.492 2022-12-09 10:41:59.457 2022-12-09 10:41:59.457 0.035 0.035
3 825 C:\Users\User\AppData\Local\Temp\__PSScriptPol... 2022-12-09 10:41:59.493 2022-12-09 10:41:59.457 2022-12-09 10:41:59.457 0.036 0.036
4 3273 C:\Windows\Temp\__PSScriptPolicyTest_oudiyrti.... 2022-12-16 09:58:35.467 2022-12-16 09:58:35.333 2022-12-16 09:58:35.333 0.134 0.134
5 3274 C:\Windows\Temp\__PSScriptPolicyTest_0njfuqwh.... 2022-12-16 09:58:35.469 2022-12-16 09:58:35.336 2022-12-16 09:58:35.336 0.133 0.133
6 3753 C:\Users\User\AppData\Local\Temp\__PSScriptPol... 2022-12-16 10:37:54.150 2022-12-16 10:37:54.065 2022-12-16 10:37:54.065 0.085 0.085
7 3754 C:\Users\User\AppData\Local\Temp\__PSScriptPol... 2022-12-16 10:37:54.157 2022-12-16 10:37:54.069 2022-12-16 10:37:54.069 0.088 0.088
8 21283 C:\Windows\Temp\__PSScriptPolicyTest_zaksyazf.... 2022-12-19 08:23:25.208 2022-12-19 08:23:25.107 2022-12-19 08:23:25.107 0.101 0.101
9 21284 C:\Windows\Temp\__PSScriptPolicyTest_xn2yn2xz.... 2022-12-19 08:23:25.208 2022-12-19 08:23:25.107 2022-12-19 08:23:25.107 0.101 0.101
10 22827 C:\Users\User\AppData\Local\Temp\__PSScriptPol... 2022-12-19 12:28:41.122 2022-12-19 12:28:41.085 2022-12-19 12:28:41.085 0.037 0.037
11 22828 C:\Users\User\AppData\Local\Temp\__PSScriptPol... 2022-12-19 12:28:41.122 2022-12-19 12:28:41.085 2022-12-19 12:28:41.085 0.037 0.037
12 23886 C:\Windows\Temp\__PSScriptPolicyTest_s5uualr4.... 2022-12-20 08:56:57.505 2022-12-20 08:56:57.470 2022-12-20 08:56:57.470 0.035 0.035
13 23887 C:\Windows\Temp\__PSScriptPolicyTest_bmkk4az5.... 2022-12-20 08:56:57.521 2022-12-20 08:56:57.472 2022-12-20 08:56:57.472 0.049 0.049
14 32061 C:\Windows\Temp\__PSScriptPolicyTest_qbrmgl21.... 2022-12-22 17:10:17.780 2022-12-22 17:10:17.722 2022-12-22 17:10:17.722 0.058 0.058
15 32062 C:\Windows\Temp\__PSScriptPolicyTest_axydttzb.... 2022-12-22 17:10:17.785 2022-12-22 17:10:17.722 2022-12-22 17:10:17.722 0.063 0.063
16 32253 C:\Users\User\AppData\Local\Temp\__PSScriptPol... 2022-12-22 17:12:58.766 2022-12-22 17:12:58.676 2022-12-22 17:12:58.676 0.090 0.090
17 32254 C:\Users\User\AppData\Local\Temp\__PSScriptPol... 2022-12-22 17:12:58.769 2022-12-22 17:12:58.676 2022-12-22 17:12:58.681 0.093 0.088
18 32257 C:\Users\User\AppData\Local\Temp\__PSScriptPol... 2022-12-22 17:13:00.368 2022-12-22 17:13:00.356 2022-12-22 17:13:00.356 0.012 0.012
19 32258 C:\Users\User\AppData\Local\Temp\__PSScriptPol... 2022-12-22 17:13:00.368 2022-12-22 17:13:00.356 2022-12-22 17:13:00.356 0.012 0.012
20 32960 C:\Users\User\AppData\Local\Temp\__PSScriptPol... 2022-12-23 14:18:14.867 2022-12-23 14:18:14.469 2022-12-23 14:18:14.469 0.398 0.398
21 32961 C:\Users\User\AppData\Local\Temp\__PSScriptPol... 2022-12-23 14:18:14.867 2022-12-23 14:18:14.469 2022-12-23 14:18:14.469 0.398 0.398
22 33587 C:\Users\User\AppData\Local\Temp\__PSScriptPol... 2022-12-23 15:28:46.761 2022-12-23 15:28:46.660 2022-12-23 15:28:46.660 0.101 0.101
23 33588 C:\Users\User\AppData\Local\Temp\__PSScriptPol... 2022-12-23 15:28:46.761 2022-12-23 15:28:46.660 2022-12-23 15:28:46.660 0.101 0.101
In [ ]:
list(only_powershell[only_powershell['Cluster'] == 4].reset_index(drop=True)['TargetFilename'])[-5:-1]
Out[ ]:
['C:\\Users\\User\\AppData\\Local\\Temp\\__PSScriptPolicyTest_jrydxbmj.a5t.psm1',
 'C:\\Users\\User\\AppData\\Local\\Temp\\__PSScriptPolicyTest_b35xidpj.e2q.ps1',
 'C:\\Users\\User\\AppData\\Local\\Temp\\__PSScriptPolicyTest_cprwb2zm.jcw.psm1',
 'C:\\Users\\User\\AppData\\Local\\Temp\\__PSScriptPolicyTest_giuslhqj.bbb.ps1']

Computing true/false positives/negatives¶

In this section, we compute the number of true positives, true negatives, false positives and false negatives, as well as some metrics related to these quantities. Before we continue, it is useful to define all of these quantities:

  • The number of true positives is the number of files which are flagged as alerts (.ps1 or .psm1 files whose CreateDelta and UpdateDelta are both below 0.5 seconds), and which are in the process tree of a known malware process;
  • The number of true negatives is the number of files which are not flagged as alerts, and which are not in the process tree of a known malware process;
  • The number of false positives is the number of files which are flagged as alerts, and which are not in the process tree of a known malware process;
  • The number of false negatives is the number of files which are not flagged as alerts, and which are in the process tree of a known malware process.

For the purposes of the above definition, the known malware processes are those associated with one of the following images:

  • C:\Users\User\Downloads\2ecbf5a27adc238af0b125b985ae2a8b1bc14526faea3c9e40e6c3437245d830.exe
  • C:\Users\User\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup\Systdeeem.exe
Note: the following 4 files are in the process trees of the known malware processes:
  • C:\Users\User\AppData\Local\Temp\__PSScriptPolicyTest_jrydxbmj.a5t.psm1
  • C:\Users\User\AppData\Local\Temp\__PSScriptPolicyTest_b35xidpj.e2q.ps1
  • C:\Users\User\AppData\Local\Temp\__PSScriptPolicyTest_cprwb2zm.jcw.psm1
  • C:\Users\User\AppData\Local\Temp\__PSScriptPolicyTest_giuslhqj.bbb.ps1
In [23]:
# These are the malware filenames, as given above, but encoded into properly 'formatted' strings for use in Python:
malware_filenames = set([
    'C:\\Users\\User\\AppData\\Local\\Temp\\__PSScriptPolicyTest_jrydxbmj.a5t.psm1',
    'C:\\Users\\User\\AppData\\Local\\Temp\\__PSScriptPolicyTest_b35xidpj.e2q.ps1',
    'C:\\Users\\User\\AppData\\Local\\Temp\\__PSScriptPolicyTest_cprwb2zm.jcw.psm1',
    'C:\\Users\\User\\AppData\\Local\\Temp\\__PSScriptPolicyTest_giuslhqj.bbb.ps1'
])
malware_filenames
Out[23]:
{'C:\\Users\\User\\AppData\\Local\\Temp\\__PSScriptPolicyTest_b35xidpj.e2q.ps1',
 'C:\\Users\\User\\AppData\\Local\\Temp\\__PSScriptPolicyTest_cprwb2zm.jcw.psm1',
 'C:\\Users\\User\\AppData\\Local\\Temp\\__PSScriptPolicyTest_giuslhqj.bbb.ps1',
 'C:\\Users\\User\\AppData\\Local\\Temp\\__PSScriptPolicyTest_jrydxbmj.a5t.psm1'}

Using the set of malware filenames given above, the number of true positives is the number of alerted filenames which occur in this set, which is computed below:

In [45]:
true_positives = len(malware_filenames.intersection(set(alerts['TargetFilename'])))
true_positives
Out[45]:
4

To compute the number of false negatives, we take the set difference instead, which leaves the malware filenames that were not alerted:

In [30]:
false_negatives = len(malware_filenames.difference(set(alerts['TargetFilename'])))
false_negatives
Out[30]:
0

Next, the number of false positives is the number of filenames which are flagged as alerts, but which are not in the set of malware filenames:

In [38]:
false_positives = len(set(alerts['TargetFilename']).difference(malware_filenames))
false_positives
Out[38]:
20

Then, the number of true negatives is the number of files which have a deletion event and an accompanying creation event, but which are not counted as a true positive, false negative or false positive. This gives:

In [43]:
true_negatives = len(usable['TargetFilename']) - true_positives - false_negatives - false_positives
true_negatives
Out[43]:
34148
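
The four counts computed above can also be derived in one place with set operations. The following is a minimal sketch under stated assumptions: `confusion_counts` is a hypothetical helper, the file names are made up for illustration, and it counts unique names, whereas the notebook counts the rows of `usable` for the true negatives.

```python
def confusion_counts(all_items, flagged, malicious):
    """Return (tp, fp, fn, tn) for a detector that flags a subset of all_items."""
    all_items, flagged, malicious = set(all_items), set(flagged), set(malicious)
    tp = len(flagged & malicious)              # flagged and malicious
    fp = len(flagged - malicious)              # flagged but benign
    fn = len(malicious - flagged)              # malicious but missed
    tn = len(all_items - flagged - malicious)  # neither flagged nor malicious
    return tp, fp, fn, tn


# Made-up example data, not taken from the dataset
files = {"a.ps1", "b.ps1", "c.psm1", "d.txt", "e.log"}
flagged = {"a.ps1", "b.ps1"}
malicious = {"a.ps1", "c.psm1"}

tp, fp, fn, tn = confusion_counts(files, flagged, malicious)
print(tp, fp, fn, tn)
```

When every malicious item is part of `all_items`, the four counts add up to the number of unique items, which is the same identity the notebook uses to obtain the true negatives by subtraction.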

Finally, we compute some metrics using these quantities:

In [46]:
accuracy = (true_positives + true_negatives) / (true_positives + false_positives + true_negatives + false_negatives)
precision = true_positives / (true_positives + false_positives)
recall = true_positives / (true_positives + false_negatives)
FPR = false_positives / (false_positives + true_negatives) # false positive rate
TNR = true_negatives / (false_positives + true_negatives)
F1_score = 2 * precision * recall / (precision + recall)

print("Accuracy            = " + "{0:.3f}".format(accuracy))
print("Precision           = " + "{0:.3f}".format(precision))
print("Recall              = " + "{0:.3f}".format(recall))
print("False Positive Rate = " + "{0:.3f}".format(FPR))
print("True  Negative Rate = " + "{0:.3f}".format(TNR))
print("F1-score            = " + "{0:.3f}".format(F1_score))
Accuracy            = 0.999
Precision           = 0.167
Recall              = 1.000
False Positive Rate = 0.001
True  Negative Rate = 0.999
F1-score            = 0.286
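
The formulas above divide by sums that can be zero, for example when nothing is flagged at all. As a hedged sketch, the hypothetical helper `detection_metrics` below wraps the same formulas and returns NaN for an undefined metric instead of raising `ZeroDivisionError`. It is called with the counts computed in this notebook.

```python
import math


def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary-classification metrics from confusion-matrix counts."""

    def ratio(numerator: float, denominator: float) -> float:
        # Return NaN instead of raising ZeroDivisionError when a metric is undefined
        return numerator / denominator if denominator else math.nan

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "false_positive_rate": ratio(fp, fp + tn),
        "true_negative_rate": ratio(tn, fp + tn),
        "f1_score": ratio(2 * precision * recall, precision + recall),
    }


# Counts taken from the cells above
metrics = detection_metrics(tp=4, fp=20, tn=34148, fn=0)
for name, value in metrics.items():
    print(f"{name:<20} = {value:.3f}")
```

The gap between the accuracy (0.999) and the precision (0.167) reflects the class imbalance: only 4 of the 34172 rows belong to the malware, so the true negatives dominate the accuracy.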