OSQuery


Latest revision as of 18:34, 29 August 2021

Query your Linux operating system like a database

Linux offers many commands to help users gather information about their host operating system: listing files or directories to check attributes; querying which packages are installed, which processes are running, and which services start at boot; or learning about the system's hardware. Each command uses its own output format, so you need tools like grep, sed, and awk to filter the results for specific information. Much of this information also changes frequently as the system's state changes.

It would be helpful to view all of this information formatted like the output of an SQL database query. Imagine that you could query the output of the ps and rpm commands as if they were SQL database tables with similar names.

Fortunately, there is a tool that does just that and much more: Osquery is an open source "SQL powered operating system instrumentation, monitoring, and analytics framework."

Many applications that handle security, DevOps, compliance, and inventory management (to name a few) build on the core functionality that Osquery provides. Osquery is available for Linux, macOS, Windows, and FreeBSD. Install the latest version for your operating system by following its installation instructions, then check which version you have:

osqueryi --version

osqueryi version 4.7.0

Osquery components

Osquery has two main components:

  • osqueryi is an interactive SQL query console. It is a standalone utility that does not need super-user privileges (unless you are querying tables that need that level of access).
  • osqueryd is like a monitoring daemon for the host it is installed on. This daemon can schedule queries to execute at regular intervals to gather information from the infrastructure.

You can run the osqueryi utility without having the osqueryd daemon running. Another utility, osqueryctl, controls starting, stopping, and checking the status of the daemon.
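
To make the scheduling idea concrete, here is a minimal Python sketch that builds a configuration with one scheduled query. It assumes osqueryd's JSON configuration format, with a top-level schedule map whose entries carry a query and an interval in seconds. The query name users_snapshot, the helper build_config, and the one-hour interval are illustrative choices, not taken from this page; check the Osquery documentation for where your platform expects the configuration file.

```python
import json


def build_config(name, query, interval_seconds):
    """Return an osqueryd-style config dict with a single scheduled query."""
    return {
        "schedule": {
            # Hypothetical query name, used here only as a label.
            name: {
                "query": query,
                "interval": interval_seconds,  # seconds between runs
            }
        }
    }


config = build_config(
    "users_snapshot",
    "SELECT uid, gid, directory, shell FROM users;",
    3600,
)
config_text = json.dumps(config, indent=2)
print(config_text)
```

The printed JSON could be saved to a file and handed to the daemon; how it is loaded depends on your installation.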

Use the osqueryi interactive prompt

You interact with Osquery much like you would use an SQL database. In fact, osqueryi is a modified version of the SQLite shell. Running the osqueryi command drops you into an interactive shell where you can run commands specific to Osquery, which often start with a dot (.):

osqueryi

Using a virtual database. Need help, type '.help'
osquery>

To quit the interactive shell, run the .quit command to get back to the operating system's shell:

osquery>
osquery> .quit

Find out what tables are available

As mentioned, Osquery makes data available as the output of SQL queries. Information in databases is often saved in tables. But how can you query these tables if you don't know their names? Well, you can run the .tables command to list all the tables that you can query. If you are a long-time Linux user or a sysadmin, the table names will be familiar, as you have been using operating system commands to get this information:

osquery> .tables

 => acpi_tables
 => apparmor_events
 => apparmor_profiles
 => apt_sources
<< snip >>
 => arp_cache
 => user_ssh_keys
 => users
 => yara
 => yara_events
 => ycloud_instance_metadata
 => yum_sources

osquery>

Check the schema for individual tables

Now that you know the table names, you can see what information each table provides. As an example, choose processes, since the ps command is used quite often to get this information. Run the .schema command followed by the table name to see what information is saved in this table. If you want to check the results, you could quickly run ps -ef or ps aux and compare the output with the contents of the table:

osquery> .schema processes

CREATE TABLE processes(`pid` BIGINT, `name` TEXT, `path` TEXT, `cmdline` TEXT, `state` TEXT, `cwd` TEXT, `root` TEXT, `uid` BIGINT, `gid` BIGINT, `euid` 
BIGINT, `egid` BIGINT, `suid` BIGINT, `sgid` BIGINT, `on_disk` INTEGER, `wired_size` BIGINT, `resident_size` BIGINT, `total_size` BIGINT, `user_time` 
BIGINT, `system_time` BIGINT, `disk_bytes_read` BIGINT, `disk_bytes_written` BIGINT, `start_time` BIGINT, `parent` BIGINT, `pgroup` BIGINT, `threads` 
INTEGER, `nice` INTEGER, `is_elevated_token` INTEGER HIDDEN, `elapsed_time` BIGINT HIDDEN, `handle_count` BIGINT HIDDEN, `percent_processor_time` BIGINT 
HIDDEN, `upid` BIGINT HIDDEN, `uppid` BIGINT HIDDEN, `cpu_type` INTEGER HIDDEN, `cpu_subtype` INTEGER HIDDEN, `phys_footprint` BIGINT HIDDEN, PRIMARY KEY 
(`pid`)) WITHOUT ROWID;

To drive home the point, use the following command to see the schema for the RPM packages and compare the information with the output of the rpm -qa and rpm -qi commands:

osquery> .schema rpm_packages

CREATE TABLE rpm_packages(`name` TEXT, `version` TEXT, `release` TEXT, `source` TEXT, `size` BIGINT, `sha1` TEXT, `arch` TEXT, `epoch` INTEGER, 
`install_time` INTEGER, `vendor` TEXT, `package_group` TEXT, `pid_with_namespace` INTEGER HIDDEN, `mount_namespace_id` TEXT HIDDEN, PRIMARY KEY (`name`, 
`version`, `release`, `arch`, `epoch`, `pid_with_namespace`)) WITHOUT ROWID;


In case that schema information is too cryptic for you, there is another way to print the table information in a verbose, tabular format: the PRAGMA command. For example, I'll use PRAGMA to see information for the users table in a nice format. One benefit of this tabular information is that you can focus on the field you want to query and see the type of information that it provides:

osquery> PRAGMA table_info(users);
 
+-----+-------------+--------+---------+------------+----+
| cid | name        | type   | notnull | dflt_value | pk |
+-----+-------------+--------+---------+------------+----+
| 0   | uid         | BIGINT | 1       |            | 1  |
| 1   | gid         | BIGINT | 0       |            | 0  |
| 2   | uid_signed  | BIGINT | 0       |            | 0  |
| 3   | gid_signed  | BIGINT | 0       |            | 0  |
| 4   | username    | TEXT   | 1       |            | 2  |
| 5   | description | TEXT   | 0       |            | 0  |
| 6   | directory   | TEXT   | 0       |            | 0  |
| 7   | shell       | TEXT   | 0       |            | 0  |
| 8   | uuid        | TEXT   | 1       |            | 3  |
+-----+-------------+--------+---------+------------+----+
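
Because osqueryi is a modified SQLite shell, PRAGMA table_info behaves much as it does in plain SQLite. The sketch below uses Python's standard sqlite3 module on a small stand-in table (not the real Osquery users table, and with only three of its columns) to show how the pk column numbers the parts of a composite primary key.

```python
import sqlite3

# In-memory stand-in table; the real Osquery table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    "uid BIGINT NOT NULL, gid BIGINT, username TEXT NOT NULL, "
    "PRIMARY KEY (uid, username))"
)

# Each row is (cid, name, type, notnull, dflt_value, pk).
columns = conn.execute("PRAGMA table_info(users);").fetchall()
for cid, name, col_type, notnull, default, pk in columns:
    print(cid, name, col_type, notnull, pk)

# pk is 0 for non-key columns, otherwise the 1-based position in the key.
key_columns = [c[1] for c in sorted(columns, key=lambda c: c[5]) if c[5]]
print(key_columns)
conn.close()
```

Read this way, the pk values 1, 2, and 3 in the output above mark uid, username, and uuid as the first, second, and third parts of the table's primary key.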

Run your first query

Now that you know the table names, their schemas, and the columns you can query, run your first SQL query to view the information. The query below returns, for users present on the system, each one's user ID, group ID, home directory, default shell, and UUID (empty in this output). Linux users could get this information by viewing the contents of the /etc/passwd file and doing some grep, sed, and awk magic.

osquery> select uid,gid,directory,shell,uuid FROM users LIMIT 7;

+-----+-----+----------------+----------------+------+
| uid | gid | directory      | shell          | uuid |
+-----+-----+----------------+----------------+------+
| 0   | 0   | /root          | /bin/bash      |      |
| 1   | 1   | /bin           | /sbin/nologin  |      |
| 2   | 2   | /sbin          | /sbin/nologin  |      |
| 3   | 4   | /var/adm       | /sbin/nologin  |      |
| 4   | 7   | /var/spool/lpd | /sbin/nologin  |      |
| 5   | 0   | /sbin          | /bin/sync      |      |
| 6   | 0   | /sbin          | /sbin/shutdown |      |
+-----+-----+----------------+----------------+------+
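
For comparison, the manual route would look something like the following Python sketch, which splits passwd-style lines into the same columns the query returns. The sample text and the parse_passwd helper are illustrative assumptions; on a real system you would read /etc/passwd instead of the embedded sample.

```python
SAMPLE_PASSWD = """\
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
"""


def parse_passwd(text):
    """Turn passwd-format lines into dicts with uid, gid, directory, shell."""
    rows = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        # Fields: name:password:uid:gid:gecos:home:shell
        name, _pw, uid, gid, _gecos, home, shell = line.split(":")
        rows.append({
            "username": name,
            "uid": int(uid),
            "gid": int(gid),
            "directory": home,
            "shell": shell,
        })
    return rows


for row in parse_passwd(SAMPLE_PASSWD):
    print(row["uid"], row["gid"], row["directory"], row["shell"])
```

The single SELECT statement replaces both the field splitting and the column selection done by hand here.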

Run queries without entering interactive mode

What if you want to run a query without entering the osqueryi interactive mode? This could be very useful if you are writing shell scripts around it. In this case, you could echo the SQL query and pipe it to osqueryi right from the Bash shell:

$ echo "select uid,gid,directory,shell,uuid FROM users LIMIT 7;" | osqueryi

+-----+-----+----------------+----------------+------+
| uid | gid | directory      | shell          | uuid |
+-----+-----+----------------+----------------+------+
| 0   | 0   | /root          | /bin/bash      |      |
| 1   | 1   | /bin           | /sbin/nologin  |      |
| 2   | 2   | /sbin          | /sbin/nologin  |      |
| 3   | 4   | /var/adm       | /sbin/nologin  |      |
| 4   | 7   | /var/spool/lpd | /sbin/nologin  |      |
| 5   | 0   | /sbin          | /bin/sync      |      |
| 6   | 0   | /sbin          | /sbin/shutdown |      |
+-----+-----+----------------+----------------+------+
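
In a script, the drawn table is awkward to parse. Here is a minimal Python sketch of one alternative. It assumes your osqueryi build accepts a --json flag and a query as a command-line argument (check osqueryi --help), and that the JSON output is an array of objects whose values are all strings. The helper names run_query and parse_rows are made up for this example.

```python
import json
import shutil
import subprocess

QUERY = "SELECT uid, gid, directory, shell FROM users LIMIT 7;"


def parse_rows(json_text):
    """Decode osqueryi JSON output and convert uid/gid strings to ints."""
    rows = json.loads(json_text)
    for row in rows:
        row["uid"] = int(row["uid"])
        row["gid"] = int(row["gid"])
    return rows


def run_query(query):
    """Run one query through osqueryi and return the parsed rows."""
    result = subprocess.run(
        ["osqueryi", "--json", query],
        capture_output=True, text=True, check=True,
    )
    return parse_rows(result.stdout)


# Only call osqueryi when it is actually installed on this host.
if shutil.which("osqueryi"):
    for row in run_query(QUERY):
        print(row["uid"], row["directory"], row["shell"])
```

Keeping the parsing in its own function means it can be exercised with a canned JSON string on machines where Osquery is not installed.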

Summary

Osquery is a powerful tool that exposes a lot of host information, which you can apply to many use cases. You can learn more about Osquery by reading its documentation.