Database Vulnerability Scanning

At cPanel Conference, I was asked whether I do any database scanning in addition to the malware scanning I perform on automated migrations. I’m glad the question came up, because I had never actually considered it before.

A quick search around the internet shows that not many other people have considered it, either.

Basically, all I could find in the way of instructions for checking databases for malware was:

Open up phpMyAdmin
Look for anything ‘weird’

It’s that simple, folks.

What’s ‘weird’ you ask? Well, anything that’s ‘not supposed to be there’. Helpful.

What should we be looking for?

Obviously, this set of instructions is not very actionable; we cannot automatically scan databases for ‘weird’. We need to narrow that down a bit.

One post did include some examples of items that are consistently malware-y, including cases of executable PHP stored in the database. I asked a few members of our security team what they might put on a list of PHP functions that don’t belong in a database, and we ended up with the following:

eval()
base64_decode()
gzinflate()
error_reporting(0)
shell_exec()
str_rot13()

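Most of these are functions that a competent server admin (that’s you) should be adding to the PHP disable_functions setting for security following their migration, because they are generally unsafe (shell_exec() runs shell commands) or usually unnecessary (the encoding functions mostly serve to obfuscate code, and that obfuscation is easily decoded). One caveat: eval() is a language construct rather than a function, so disable_functions cannot block it.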

This is not a completely exhaustive list, because hackers are extremely crafty, but it should be a good enough start to catch common instances of malware or backdoors stored in databases.

Searching the database

The easiest way for me, a non-DBA, to search a database for a certain string is just to grep a database dump. The database has to be dumped anyway to migrate it, so it’s an opportune time. Here is my grep command, which simply prints the name of the file if any expression matches.

grep -l -Ei \
-e 'eval[ ]?\(' \
-e 'base64_decode[ ]?\(' \
-e 'gzinflate[ ]?\(' \
-e 'error_reporting[ ]?\((0|off)\)' \
-e 'shell_exec[ ]?\(' \
-e 'str_rot13[ ]?\(' \
"$filename"

The regex allows an optional space between the function name and the opening parenthesis, so both eval ( and eval( match, and the portion after error_reporting triggers if the hacker used either 0 or off to turn off the display of PHP errors. The -i flag makes the match case-insensitive.
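To see what the expressions do and don’t catch, here is a small sketch that runs them against a few sample lines. The sample strings are invented for illustration and don’t come from a real dump. Note that the patterns match substrings, so a word like medieval followed by a parenthesis also triggers, which is one source of false positives.

```bash
# Invented sample lines: three should match on purpose, one is a false
# positive ("medieval (" contains "eval ("), and one should not match.
samples='<?php eval(base64_decode("aGk=")); ?>
<?php eval ($payload); ?>
ERROR_REPORTING(0);
a note about the medieval (sic) period
error_reporting(E_ALL);'

# Keep only the sample lines that match at least one expression.
matches=$(printf '%s\n' "$samples" | grep -Ei \
  -e 'eval[ ]?\(' \
  -e 'base64_decode[ ]?\(' \
  -e 'gzinflate[ ]?\(' \
  -e 'error_reporting[ ]?\((0|off)\)' \
  -e 'shell_exec[ ]?\(' \
  -e 'str_rot13[ ]?\(')

printf '%s\n' "$matches"
```

Everything except the error_reporting(E_ALL) line is printed, including the medieval one.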

This specific usage is handy for looping through a list of files, such as those files stored in /home/dbdumps in my example, and grep will only print the filenames that triggered a match.

# Glob the directory instead of parsing ls, so each name keeps its full path
for filename in /home/dbdumps/*; do
grep -l -Ei \
-e 'eval[ ]?\(' \
-e 'base64_decode[ ]?\(' \
-e 'gzinflate[ ]?\(' \
-e 'error_reporting[ ]?\((0|off)\)' \
-e 'shell_exec[ ]?\(' \
-e 'str_rot13[ ]?\(' \
"$filename"
done
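Since the same list of expressions gets repeated in every command, one option is to keep them in a single file and pass that file to grep with -f. This is only a sketch: the scan_dumps function name and the temporary pattern file are made up for this example and are not part of the migration tooling.

```bash
# One extended regular expression per line. The quoted 'EOF' keeps the
# backslashes literal.
patterns=$(mktemp)
cat > "$patterns" <<'EOF'
eval[ ]?\(
base64_decode[ ]?\(
gzinflate[ ]?\(
error_reporting[ ]?\((0|off)\)
shell_exec[ ]?\(
str_rot13[ ]?\(
EOF

# Hypothetical helper: print the names of files in directory $2 that match
# any expression in pattern file $1.
scan_dumps() {
  grep -l -Ei -f "$1" "$2"/*
}

# Example: scan_dumps "$patterns" /home/dbdumps
```

Adding a new function to watch for then means adding one line to the pattern file rather than editing every command.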

Adding a scan to a database stream

From our lesson on syncing databases, we know that we can stream a database over SSH. The tee command from GNU coreutils sends a second copy of its input to a different place, so adding it to the pipeline lets us perform a scan in the same step as bringing the database to the new server.

# our original command, run from the target server
ssh -C host3 "mysqldump devmca_wp902" | mysql devmca_wp902

# the modified command, including the tee to scan
ssh -C host3 "mysqldump devmca_wp902" | \
tee >(if grep -Ei \
-e 'eval[ ]?\(' \
-e 'base64_decode[ ]?\(' \
-e 'gzinflate[ ]?\(' \
-e 'error_reporting[ ]?\((0|off)\)' \
-e 'shell_exec[ ]?\(' \
-e 'str_rot13[ ]?\(' \
&> /dev/null; then
echo devmca_wp902 >> /root/dbmalware.txt
fi) | \
mysql devmca_wp902

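The subshell must not print anything to stdout. Its stdout is the same pipe that feeds mysql, so any output would be mixed into the import whenever something was detected. That is why grep’s output is discarded and the result is written to a file instead. Swapping in grep -q would not be a safe shortcut: it exits on the first match, and tee would then be writing to a closed pipe, which would most likely cut the import short.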

The name of the database can easily be substituted with a shell variable for usage against a number of database names.
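As a rough sketch of that substitution, the pipeline can be wrapped in a function that takes the source host, database name, and report file as arguments. The function name stream_and_scan and the second database name in the commented example are hypothetical. The function assumes bash (for the process substitution) and the same ssh and mysql access as the command above.

```bash
#!/usr/bin/env bash

# Hypothetical wrapper: stream database $2 from host $1 into the local
# MySQL server, and append its name to report file $3 if any expression
# matches along the way.
stream_and_scan() {
  local source_host=$1 db=$2 report=$3
  ssh -C "$source_host" "mysqldump $db" | \
    tee >(if grep -Ei \
          -e 'eval[ ]?\(' \
          -e 'base64_decode[ ]?\(' \
          -e 'gzinflate[ ]?\(' \
          -e 'error_reporting[ ]?\((0|off)\)' \
          -e 'shell_exec[ ]?\(' \
          -e 'str_rot13[ ]?\(' \
          > /dev/null 2>&1; then
            echo "$db" >> "$report"
          fi) | \
    mysql "$db"
}

# Example, commented out because it needs real hosts and databases
# (devmca_wp903 is a made-up second database name):
# for db in devmca_wp902 devmca_wp903; do
#   stream_and_scan host3 "$db" /root/dbmalware.txt
# done
```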

So does it work?

Yes and no. It does exactly what it’s supposed to.

The scan detects whether any of these functions are stored in a database, and when placed inline with a database stream, it does not interrupt the import, which is what I wanted (the tech is notified at the end of the automated migration whether any databases were flagged).

However, there are some drawbacks. The grep, of course, does nothing about the fact that a bad function was detected, and it doesn’t say which function matched or what it does. It’s merely intended as an indicator that this database will require additional investigation. A database that doesn’t trigger any expression isn’t proven clean, either, because the scan only looks for certain things.
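For that follow-up investigation, one possible starting point is to have grep print only the matched text with -o and tally it per function. This is a sketch, not part of the migration scan: tally_matches is a made-up name, and -o is a GNU/BSD grep extension rather than POSIX.

```bash
# Hypothetical helper: count how many times each expression matched in the
# dump file given as $1. Case is folded so EVAL( and eval( count together,
# and sed strips everything from the first space or parenthesis so only the
# function name remains.
tally_matches() {
  grep -oEi \
    -e 'eval[ ]?\(' \
    -e 'base64_decode[ ]?\(' \
    -e 'gzinflate[ ]?\(' \
    -e 'error_reporting[ ]?\((0|off)\)' \
    -e 'shell_exec[ ]?\(' \
    -e 'str_rot13[ ]?\(' \
    "$1" |
    tr 'A-Z' 'a-z' |
    sed -E 's/[ (].*$//' |
    sort | uniq -c | sort -rn
}

# Example: tally_matches "$filename"
```

The output is one line per function name with its match count, which at least tells you what to search for when you open the dump.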

I was really hoping that I would be able to find a migration organically that actually contained malware so that I could show you, but I was only able to find false positives. Yes, there were cases where the functions I’m searching for were legitimately inside of the database. In each case, the CMS that uses the database was poorly coded to obfuscate some passwords with a two-way cipher rather than using a one-way hash.

It’s true that there are no rules about coding. You can do literally anything you want, even if it’s not a good idea!