Lexmark Printers Firmware Extraction – Part C

Dan Hazeneshprong

13 Oct, 2020 • 14 minutes read

Digging Deeper into the Printer’s Firmware

In Parts A and B of this three-part blog series, we bought a Lexmark MS811 printer, extracted its Flash image and parsed it into a bootloader and file system. Now that we have some code to work with, the goal is to figure out how the printer unpacks its firmware updates.

Once we understand how to parse firmware updates, we’ll be able to download updates for other printer types, extract them, and analyze their code to understand the printer’s OS, functionality, network presence and more!

Downloading Firmware

The latest updates from the Lexmark support website can be found at http://support.lexmark.com/. We looked for the latest update for the MS811 model we were working with and found: LW75.DN2.P043.zip.

When we unpacked it, we discovered it contained:

~/Downloads$ md5sum LW75.DN2.P043.zip
de0689ab8d4547df50b368609640250b  LW75.DN2.P043.zip
~/Downloads$ unzip LW75.DN2.P043.zip
Archive:  LW75.DN2.P043.zip
extracting: ReleaseNotes_LW75.xx.P043.pdf
extracting: LW75.DN2.P043.FDN.DN.E732.fls
extracting: README_License.zip
extracting: README_Updating_Firmware_v3.pdf

The file that we were most interested in was the one with the FLS extension. It appeared to be written in Printer Job Language (PJL), but it contained an unknown command, LPROGRAMRIP, followed by what looked like an encrypted blob:

@PJL LPROGRAMRIP SOCKET=1 KERNELCOUNT=1634560 TYPECOUNT=76194544 KERNELENCR=3 FKSIGNSZ=1631249 PFID=814 TYPE=MAIN RIPNAME="den24gr" ^G^?<96>__d_(^E___#___^__f%_^R_l (...)

This blob continues until the end of the file. The commands preceding this blob are all comments. The LPROGRAMRIP command is not part of vanilla PJL. However, we knew the Lexmark MS811 must extract its firmware from this file somehow. Searching our extracted MS811 file system for references to LPROGRAMRIP, we found the following files:

/MS811/bin/cramfs$ rgrep LPROGRAMRIP
Binary file cramfs2/cramfs2_tmp/bin/Page4 matches
Binary file cramfs2/cramfs2_tmp/bin/checkfls matches

Both of these files were ARM ELF executables:

checkfls: ELF 32-bit LSB executable, ARM, EABI4 version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.3, for GNU/Linux 2.6.16, stripped
Page4: ELF 32-bit LSB executable, ARM, EABI4 version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.3, for GNU/Linux 2.6.16, stripped

We hoped one of these files was in charge of unpacking the FLS file.
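The plain-text part of the LPROGRAMRIP command shown earlier can be pulled apart before touching either binary. The following is a minimal sketch, not the printer's own parser: the function name `parse_lprogramrip` is ours, and it assumes the parameters are space-separated KEY=value pairs with optional double quotes, as in the sample line. The returned offset only marks where the recognizable parameters end; where exactly the encrypted blob starts is not established here.

```python
import re

# The LPROGRAMRIP command followed by a run of KEY=value parameters.
# Assumption: parameters look like the sample line (upper-case names,
# bare or double-quoted values, separated by spaces).
_COMMAND = re.compile(rb'@PJL LPROGRAMRIP((?: +[A-Z]+=(?:"[^"]*"|[^\s"]+))+)')
_PARAM = re.compile(rb'([A-Z]+)=(?:"([^"]*)"|([^\s"]+))')


def parse_lprogramrip(data: bytes):
    """Return (params, end_offset) for the first LPROGRAMRIP command in data."""
    match = _COMMAND.search(data)
    if match is None:
        raise ValueError("no LPROGRAMRIP command found")
    params = {}
    for name, quoted, bare in _PARAM.findall(match.group(1)):
        text = (quoted or bare).decode("ascii")
        # Purely numeric values become ints, everything else stays a string.
        params[name.decode("ascii")] = int(text) if text.isdigit() else text
    return params, match.end()


# Sample built from the command line quoted in the article, with a few
# stand-in bytes for the start of the blob.
sample = (b'@PJL COMMENT example\r\n'
          b'@PJL LPROGRAMRIP SOCKET=1 KERNELCOUNT=1634560 TYPECOUNT=76194544 '
          b'KERNELENCR=3 FKSIGNSZ=1631249 PFID=814 TYPE=MAIN RIPNAME="den24gr" '
          b'\x07\x7f\x96')
params, end = parse_lprogramrip(sample)
```

The values that matter later in the analysis (KERNELCOUNT, TYPECOUNT, KERNELENCR, PFID and RIPNAME) end up as plain dictionary entries.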

Option 1: checkfls

checkfls is a small program that opens a file and looks for PJL commands by searching for the string “@PJL”. When it finds one, it checks the command’s name and acts accordingly. We saw that it calls a special function for “@PJL LPROGRAMRIP.” It also checks for the following commands, but stops the program if it sees any of them:

  • LPROGRAMSCAN
  • LPROGRAMUICC
  • LPROGRAMDBCS
  • LDOWNLOADBUNDLE
  • LPROGRAMADAPTOR
  • LPROGRAM
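The scanning behavior described above can be sketched roughly as follows. This is our own approximation rather than decompiled code: the name `scan_fls`, the return values, and the choice to compare whole command words (so that LPROGRAM is not confused with LPROGRAMRIP) are all assumptions.

```python
import re

# Commands that make the scan stop, as listed in the article.
REJECTED = {
    "LPROGRAMSCAN", "LPROGRAMUICC", "LPROGRAMDBCS",
    "LDOWNLOADBUNDLE", "LPROGRAMADAPTOR", "LPROGRAM",
}


def scan_fls(data: bytes):
    """Walk the '@PJL' commands in order and classify the file.

    Returns ("rip", offset_after_command_name), ("rejected", name)
    or ("none", None) if neither kind of command appears.
    """
    for match in re.finditer(rb"@PJL[ \t]+([A-Za-z]+)", data):
        name = match.group(1).decode("ascii").upper()
        if name in REJECTED:
            return ("rejected", name)
        if name == "LPROGRAMRIP":
            return ("rip", match.end())
    return ("none", None)
```

Returning as soon as LPROGRAMRIP is seen keeps the scan from wandering into the binary blob, where the bytes "@PJL" could appear by coincidence.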

For LPROGRAMRIP, the program seems to take the encrypted blob after LPROGRAMRIP and compare its first bytes against a few values:

Initially, we assumed these were possible barkers/magics. However, on closer inspection of the code, it seemed the check was only successful for 0xE8146000:

We looked to see if this matched up with our file:

Unfortunately, it did not. The blob did not start with any of the four barkers, which is strange. It appeared this FLS file would have failed this test on our printer, so we decided more research was required. Regardless, that was the extent of the code of interest that we found in the checkfls program. It was time to move on to the next option.

Option 2: Page4

Page4 was a much larger program. Thankfully, it came with symbols.

The function pjlHandlerLProgramRIP, true to its name, handled the PJL LPROGRAMRIP command. It parsed its arguments and passed them to the pjlProgramRipCode function, which is a very large function that appeared to handle the entire extraction, in multiple phases. It contained a function that compared something in the file against four values, similar to the logic in checkfls.

This time it was easier to understand: it used the “RIPNAME” parameter to compare against four strings: “den24gr,” “denXXgre,” “OptraDne,” and “PCL Emulation Fonts.” Based on the printer’s configuration, the function decided whether the RIPNAME was valid. If not, it printed “Invalid Version” and exited.

In addition, we identified a function that seemed to load an encryption key, according to the LPROGRAMRIP “KERNELENCR” argument. This meant we were on the right track.

The function – pjlProgramRipCode – was pretty elaborate, containing references to HMAC, RSA and AES library functions. It seemed to parse the FLS file in several steps, parsing the kernel section according to KERNELCOUNT, and then the software section according to TYPECOUNT or RIPCOUNT (TYPECOUNT by default). The following is a summary of the flow in general terms:

  1. Check the “PFID” argument against $PFID in env.
  2. Check the “RIPNAME” argument against several version strings.
  3. Load an encryption key.
  4. For the kernel section:
    1. Read the header, signature and AES key associated with the kernel section, and decrypt them using the above key.
    2. Read and decrypt the kernel section data using the above AES key and write to “/var/tmp/fkernel.”
    3. Perform certificate checks according to the data footer.
  5. If a second section exists, repeat (4).

File Input/Output

When looking at the flow, there didn’t seem to be any calls to open() or read(). We wanted to understand how the code read data from the FLS file, so we looked at functions like sf_DecryptHeader for guidance. This function took the data at arg1 and eventually sent it to RSA_public_decrypt. However, we weren’t sure where this data came from.

We saw that the buffer sent as arg1 was allocated and then passed to a function to be populated. For purposes of this analysis, we named the populating function “f_read_from_pjl”. It appeared to perform some buffer copying, and called a function named pjlGetIp to get a source address to copy from. While the function pjlGetIp seemed interesting, it simply returned a global variable. It was fair to guess, due to the name and usage, that pjlGetIp returned some sort of handle to the PJL/FLS input, and that the purpose of f_read_from_pjl was to read data from the file.

The Encryption Keys in Data

The next thing we looked for was the key the program used for RSA decryption. There is a named function called LoadSecurityKey that is used from within the pjlProgramRipCode function. Among other things, LoadSecurityKey copies a data address onto an output parameter (highlighted below):

This data address points to an array of RSA keys:

There isn’t one single RSA key but several RSA keys that are there for redundancy. They are stored one after another in the following format:

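struct s_SecurityKeyinMem
{
    int meta;           // key group (0, 1 or 2)
    int type;           // key index within the group
    int field_8;        // unknown
    int size;
    char data[270];     // RSA public key in DER format
    __int16 field_11E;  // unknown
};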

The “meta” field seemed to define a group of keys – ten keys in all. The first group, meta=0, contains six keys. Together, the meta and type fields uniquely identify a key: they are (0, 5), (0, 4), (0, 3), (0, 2), (0, 1), (0, 0), (1, 2), (1, 1), (1, 0), (2, 0).

The function that loads these keys from memory is LoadSecurityKey. It chooses which key to load based on arg1=meta and arg2=type. In our flow, this function was called with meta=0 and type=KERNELENCR. Since our FLS file specified KERNELENCR=3, the key (0, 3) would be loaded.
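A lookup over this key array could be sketched like this. The little-endian layout follows from the struct above and the LSB ARM target, giving 288 bytes per record. Treating `size` as the number of valid bytes in `data` is our assumption, and the names `parse_key_table` and `load_security_key` are ours as well (the second only mirrors the role of LoadSecurityKey, not its real signature).

```python
import struct

# s_SecurityKeyinMem: four 32-bit ints, 270 data bytes, one 16-bit field.
KEY_RECORD = struct.Struct("<iiii270sh")  # 288 bytes, no padding


def parse_key_table(blob: bytes, count: int = 10):
    """Parse `count` consecutive key records into a {(meta, type): key} dict."""
    keys = {}
    for index in range(count):
        meta, key_type, _field_8, size, data, _field_11e = KEY_RECORD.unpack_from(
            blob, index * KEY_RECORD.size
        )
        # Assumption: `size` is the number of meaningful bytes in `data`.
        keys[(meta, key_type)] = data[:size]
    return keys


def load_security_key(keys, meta: int, key_type: int) -> bytes:
    """Pick a key by (meta, type), e.g. (0, KERNELENCR) for the kernel section."""
    return keys[(meta, key_type)]
```

With KERNELENCR=3 from our FLS file, `load_security_key(keys, 0, 3)` would select the same (0, 3) entry as the flow described above.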

How was this key used? It was a 2048-bit RSA key stored in DER format. We wanted to try to load this key ourselves. In order to convert it to PEM format, which most libraries support, we converted the 270 data bytes to base64 and wrapped the result with the strings “-----BEGIN RSA PUBLIC KEY-----” and “-----END RSA PUBLIC KEY-----”.
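The DER-to-PEM conversion needs nothing beyond the standard library. The helper below is a small sketch with a name of our choosing (`der_to_pem`); the 64-character line width is the usual PEM convention rather than something taken from the printer code.

```python
import base64


def der_to_pem(der: bytes, label: str = "RSA PUBLIC KEY") -> str:
    """Wrap DER bytes in a PEM envelope with 64-character base64 lines."""
    b64 = base64.b64encode(der).decode("ascii")
    lines = [b64[i:i + 64] for i in range(0, len(b64), 64)]
    return "-----BEGIN {0}-----\n{1}\n-----END {0}-----\n".format(
        label, "\n".join(lines)
    )
```

The resulting string is what we refer to as `rsa_pem` when loading the key.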

At this point it’s important to note that these keys are actually public keys – the program eventually passes them on to the library function RSA_public_decrypt. It makes sense the manufacturers encrypted the signature and AES key with the RSA private key, to stop any attempts at falsifying a firmware update. Meanwhile, to extract an update, all that’s needed is the matching public key, which we had. We tried loading the key:

In [2]: from Crypto.PublicKey import RSA
In [3]: RSA.importKey(rsa_pem)
Out[3]: <_RSAobj @0x7fd78d695390 n(2048)>

It worked! Next, we looked at how the program used it.

Decrypting the Section Headers With RSA

As you will remember, there were two main sections in the file, a required “kernel” section and an optional secondary section, according to symbols in the code. Each section had a header that was RSA encrypted. We took note of a suspicious function that was called twice from pjlProgramRipCode. It did the following:

  1. Loaded the public RSA key from memory,
  2. Decrypted the header and confirmed it’s 296 bytes long,
  3. Decrypted the signature and compared its size to offset 12 of the header,
  4. Decrypted the AES key, compared its size to offset 280 of the header and initialized it,
  5. Read the section’s data size from offset 292 of the header.

We named this function “f_parse_section_header.” It was unclear what the encrypted size of each header was, since the program decrypted the RSA data in chunks. This is approximately how the RSA decryption code worked:

We discovered the chunk size was 256 after some trial and error:

We tried decrypting from the beginning of the blob, where we suspected the section header should be. The PKCS#1 standard specifies that encrypted chunks are padded with random bytes at the beginning of each chunk. However, in our case, the padding seemed to be 0x01 followed by 0xFF bytes, separated from the chunk data by a null byte. This is the PKCS#1 v1.5 padding used for private-key operations such as signatures, which fits data that is recovered with RSA_public_decrypt. Once we removed the padding from both chunks, their combined size was 296 bytes, which matched the section header size in the code!
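The chunked RSA step can be reproduced with plain integer arithmetic. The sketch below is a stand-in for what RSA_public_decrypt does with this padding, not the printer's code: the function names are ours, `n` and `e` are the modulus and public exponent of the loaded key, and stopping once an expected number of bytes has been collected is an assumption about how the caller knows when to stop.

```python
def rsa_public_decrypt_chunk(chunk: bytes, n: int, e: int) -> bytes:
    """Apply the public key to one chunk and strip 00 01 FF..FF 00 padding."""
    k = (n.bit_length() + 7) // 8  # modulus length in bytes (256 for 2048 bits)
    if len(chunk) != k:
        raise ValueError("chunk length must equal the modulus length")
    block = pow(int.from_bytes(chunk, "big"), e, n).to_bytes(k, "big")
    if block[:2] != b"\x00\x01":
        raise ValueError("unexpected block type")
    sep = block.index(b"\x00", 2)  # raises ValueError if no separator exists
    if block[2:sep] != b"\xff" * (sep - 2):
        raise ValueError("padding is not all 0xFF")
    return block[sep + 1:]


def rsa_public_decrypt(blob: bytes, n: int, e: int, expected_len: int):
    """Decrypt consecutive chunks until expected_len bytes are recovered.

    Returns (plaintext, number_of_blob_bytes_consumed).
    """
    k = (n.bit_length() + 7) // 8
    out = bytearray()
    offset = 0
    while len(out) < expected_len:
        out += rsa_public_decrypt_chunk(blob[offset:offset + k], n, e)
        offset += k
    return bytes(out), offset
```

For the section header, `expected_len` would be 296, which with a 2048-bit key means two 256-byte chunks are consumed.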

The File Format

We were now able to describe the format of the file:

There appeared to be some checksum-like fields at the footer of each section, but we decided not to look into it at this time. It should be sufficient to decrypt the data based on the fields in the section header.

The Section Header

The function sf_DecryptHeader decrypts a section’s header and loads it to a buffer. By following the usage of this buffer and cross-referencing it with symbols, we could figure out some of the struct’s offsets:

Offset  Size   Content
8       DWORD  HMAC struct index (usually 0)
12      DWORD  Decrypted signature size
16      DWORD  HMAC key size
20      DWORD  HMAC key
276     DWORD  AES struct index (usually 0)
280     DWORD  Decrypted AES key size
284     DWORD  AES key byte size
288     DWORD  AES CBC or ECB (1 = CBC)
292     DWORD  Decrypted data size

The AES fields will prove more useful to us, since we don’t care about validating the information with HMAC as much as we care about decrypting it with AES.
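The header offsets listed above translate into a small parser. This is a sketch under assumptions: fields are read as little-endian 32-bit values (the binaries are LSB ARM), the HMAC key is taken as `hmac_key_size` bytes starting at offset 20 and never past offset 276, and all names (`SectionHeader`, `parse_section_header`) are ours.

```python
import struct
from typing import NamedTuple

HEADER_SIZE = 296  # size the code expects after RSA decryption


class SectionHeader(NamedTuple):
    hmac_index: int
    signature_size: int
    hmac_key_size: int
    hmac_key: bytes
    aes_index: int
    aes_key_size: int
    aes_key_bytes: int
    cbc: bool
    data_size: int


def parse_section_header(header: bytes) -> SectionHeader:
    """Read the known fields of a decrypted 296-byte section header."""
    if len(header) != HEADER_SIZE:
        raise ValueError("section header must be 296 bytes")

    def dword(offset: int) -> int:
        return struct.unpack_from("<I", header, offset)[0]

    hmac_key_size = dword(16)
    return SectionHeader(
        hmac_index=dword(8),
        signature_size=dword(12),
        hmac_key_size=hmac_key_size,
        # Assumption: the key occupies the bytes right after its size field.
        hmac_key=header[20:min(20 + hmac_key_size, 276)],
        aes_index=dword(276),
        aes_key_size=dword(280),
        aes_key_bytes=dword(284),
        cbc=dword(288) == 1,
        data_size=dword(292),
    )
```

Only `aes_key_size`, `aes_key_bytes`, `cbc` and `data_size` are needed for decryption; the HMAC fields matter only if the data is also verified.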

A word about the “struct index” fields: we found callbacks associated with AES, HMAC and RSA in memory, and they wrapped calls to library functions, such as RSA_public_decrypt. The index fields would allow the code to reference a different set of wrappers, if they were loaded right after the other wrappers. This code was compiled with only one set of wrappers for each AES/HMAC/RSA, so it was less relevant to us.

Loading the AES Key

The function sf_ProcessSymKey within f_parse_section_header read and decrypted the section’s AES key, then stored it in a data structure. We followed this structure and saw that some actions were taken in a later function, called sf_initDecryptData:

  1. Some rolling XOR was performed,
  2. An initialization wrapper was called and populated an AES structure,
  3. The wrapper for AES_set_decrypt_key was called.

Let’s address them one by one. First, the rolling XOR: sf_initDecryptData copied the AES key to a buffer, ‘XORing’ the first byte with 0x49, the second byte with the first byte and so on:
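The rolling XOR is short enough to sketch directly. The description leaves one detail open: whether each byte is XORed with the previous output byte or with the previous original key byte. The helper below (our own name, `rolling_xor`) supports both readings through a flag, and the default is a guess rather than a confirmed detail of sf_initDecryptData.

```python
def rolling_xor(key: bytes, seed: int = 0x49, use_output: bool = True) -> bytes:
    """XOR the first byte with `seed` and every later byte with its predecessor.

    use_output=True  chains on the previous XORed (output) byte.
    use_output=False chains on the previous original key byte.
    Which variant the firmware uses is an assumption to verify against real data.
    """
    out = bytearray()
    previous = seed
    for byte in key:
        value = byte ^ previous
        out.append(value)
        previous = value if use_output else byte
    return bytes(out)
```

Trying both variants against a real update and checking which one yields a sensible first plaintext block is a quick way to settle the question.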

Next, it sent the XOR’d buffer to the init wrapper, which set the encrypt/decrypt and CBC/ECB flags, and copied the XOR’d buffer when CBC was specified. We made an educated guess that the buffer was the CBC initialization vector or IV.

To clarify, the code supports two AES modes of operation, CBC and ECB. The Cipher Block Chaining (CBC) mode uses an initialization vector. To quote Wikipedia: “In CBC mode, each block of plaintext is XORed with the previous ciphertext block before being encrypted. […] To make each message unique, an initialization vector must be used in the first block.” This rolling XOR is performed for CBC only, which is why we assumed that the buffer we found above was the IV.

Using the AES Key

Closely after the call to sf_initDecryptData, we reached this code:

The code above reads 0x10000-byte chunks from the file and passes them to sf_ProcessData, which calls an AES wrapper that in turn calls AES_cbc_encrypt or AES_ecb_encrypt, according to the aforementioned flag. We confirmed the buffer we suspected was indeed the IV, because it was passed as an argument to AES_cbc_encrypt:

Finally, the function sf_VerifyHash was called shortly after the above loop. It finalized the hash of all of the data decrypted using sf_ProcessData and compared it to the section signature. Now, we can say that we won’t be able to construct our own firmware without knowing the RSA private key.
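The read, decrypt and hash loop described above has roughly the following shape. This is a structural sketch only: `process_section` is our name, the AES step is passed in as a callable so that no particular crypto library is assumed, and `mac` is any object with an `update()` method, since the exact hash the firmware uses is not identified here.

```python
CHUNK_SIZE = 0x10000  # chunk size observed in the decompiled loop


def process_section(stream, size, decrypt_chunk, mac=None, chunk_size=CHUNK_SIZE):
    """Read `size` bytes from `stream` in chunks, decrypt each and hash the result.

    decrypt_chunk: callable taking ciphertext bytes and returning plaintext bytes.
    mac: optional object with update(); fed the decrypted data, as sf_VerifyHash
         is described as checking a hash of the decrypted bytes.
    """
    out = bytearray()
    remaining = size
    while remaining > 0:
        chunk = stream.read(min(chunk_size, remaining))
        if not chunk:
            raise EOFError("section data ended early")
        plain = decrypt_chunk(chunk)
        if mac is not None:
            mac.update(plain)
        out += plain
        remaining -= len(chunk)
    return bytes(out)
```

If PyCryptodome is available, something like `AES.new(key, AES.MODE_CBC, iv).decrypt` could be passed as `decrypt_chunk`, since a CBC cipher object keeps its chaining state between calls. That pairing is our suggestion, not something taken from the firmware.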

We now know enough about the code to attempt to decrypt the section body!

Conclusion

We concluded the firmware extraction process worked as follows:

  • Jumps to the end of the LPROGRAMRIP command.
  • Chooses the correct RSA public key based on the KERNELENCR parameter.
  • For the kernel section:
    • Decrypts the 296-byte header using the RSA key and derives the signature size, AES key size, data size and AES mode (CBC/ECB).
    • Decrypts the signature and AES key using RSA.
    • Generates the CBC initialization vector based on the AES key, if relevant.
    • Decrypts the data using the AES key.
  • Repeats the process for the 2nd section, if it exists.

The original code also computes a hash of the section data and compares it against the signature, which was encrypted using a private RSA key. This prevented us from creating our own firmware updates, since we didn’t have the private key.

Extraction Results

We decided to test a script to confirm the extraction process worked as described above:

Success! We extracted these two files from the LW75.DN2.P043 firmware update:

~/lexmark/extracted$ file *
LW75.DN2.P043.FDN.DN.E732.kernel: ELF 32-bit LSB executable, ARM, EABI4 version 1 (SYSV), statically linked, for GNU/Linux 2.6.16, with debug_info, not stripped
LW75.DN2.P043.FDN.DN.E732.other: data

These files, as expected, contained similar data to what we found on the Flash image from Part B. The first file was the u-boot bootloader, while the second file contained little-endian CramFS file systems. Each firmware update can act like a patch to the current major version, so, to conserve space, not every file system is included in the update. We could extract these file systems using CramFS utilities, patched according to the inode format described in Part B:

Applicable Firmware Updates

There is an extensive list of the Lexmark printer models and the latest firmware update for each model here:

http://support.lexmark.com/index?page=content&locale=EN&productCode=&segment=SUPPORT&userlocale=EN_US&id=SO4521

Our extraction method works for all of the available updates, which includes updates from the following release series:

  • EC5 (.P617, .P618)
  • EC6 (.P32)
  • EC6.3 (.P817)
  • EC7.5 (LW75.xx.P043)

If you have a firmware update older than EC5, or have any questions, please let us know at research.blog@mediagte.io.
