Learn how to securely validate image files in React using magic bytes. To mitigate risks, combine client-side checks with robust server-side validation.
Image uploads are a common feature in web applications, but they can pose security risks if the files are not validated properly. Validation on the server is what actually enforces security, but adding client-side validation does not hurt.
When we select a file in a form, the browser derives its MIME type from the file extension. Renaming the extension is therefore enough to change the MIME type the browser reports.
Let's say we upload a PNG image to this form.
The type is shown as image/png, as expected. But what if we change the extension from .png to .pdf?
The type is now application/pdf, although it is still an image file. This makes sense: we cannot expect the browser to go through the contents of the file to determine its type.
Magic Bytes
Magic bytes are the first few bytes of a file, and they identify its type. These sequences let us determine the file type without relying on the file extension, which, as we just saw, can easily be manipulated. For example:
The JPEG image format typically starts with bytes FF D8 FF.
The PNG image format typically starts with bytes 89 50 4E 47 0D 0A 1A 0A.
Magic bytes are also known as file signatures or magic numbers. Because they sit at the beginning of a file and are specific to its format, they are often used to decide how to handle or interpret the file.
Let's extend our previous code to show the MIME type detected from the byte pattern.
Add Byte Patterns
Here, we import the MIME types and their respective byte patterns used for identification. The isMime function checks whether a pattern matches the start of the file content.
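Since the listing itself is not shown here, the following is a minimal sketch of what such a pattern table and check could look like. The names ImageMimeType and IMAGE_PATTERNS, and the exact signature of isMime, are assumptions made for illustration; the byte sequences are the JPEG and PNG signatures listed above.

```typescript
// Hypothetical set of MIME types this sketch knows about.
type ImageMimeType = "image/jpeg" | "image/png";

// Hypothetical pattern table: each MIME type maps to the bytes its files start with.
const IMAGE_PATTERNS: Record<ImageMimeType, readonly number[]> = {
  "image/jpeg": [0xff, 0xd8, 0xff],
  "image/png": [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a],
};

// Returns true when `bytes` begins with every byte of `pattern`, in order.
// The parameter list is an assumption, not the article's original code.
function isMime(bytes: Uint8Array, pattern: readonly number[]): boolean {
  if (bytes.length < pattern.length) return false;
  return pattern.every((expected, index) => bytes[index] === expected);
}
```

With this sketch, isMime(new Uint8Array([0xff, 0xd8, 0xff, 0xe0]), IMAGE_PATTERNS["image/jpeg"]) evaluates to true, while the same bytes checked against the PNG pattern give false.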
Get Mime Type from Pattern
To retrieve the MIME type of the uploaded file, we read only the required portion of the file content, which minimizes memory usage and processing time. We then use the ts-pattern library to compare the initial bytes of the file content with the predefined patterns.
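As a rough illustration of reading only the header, here is a dependency-free sketch. It swaps the ts-pattern matching described above for a plain loop so that it runs on its own. SIGNATURES, HEADER_LENGTH, and matchSignature are hypothetical names, and the signature of getImageMimeType is an assumption.

```typescript
// Hypothetical signature list. A plain loop stands in for ts-pattern here,
// so this sketch needs no external packages.
const SIGNATURES: ReadonlyArray<{ mime: string; bytes: readonly number[] }> = [
  { mime: "image/jpeg", bytes: [0xff, 0xd8, 0xff] },
  { mime: "image/png", bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
];

// Length of the longest signature: the only part of the file that must be read.
const HEADER_LENGTH = Math.max(...SIGNATURES.map((s) => s.bytes.length));

// Returns the first MIME type whose signature prefixes `header`, or null if none does.
function matchSignature(header: Uint8Array): string | null {
  for (const { mime, bytes } of SIGNATURES) {
    const matches =
      header.length >= bytes.length && bytes.every((b, i) => header[i] === b);
    if (matches) return mime;
  }
  return null;
}

// Slices off the first HEADER_LENGTH bytes instead of loading the whole file.
// A browser File is a Blob, so a file from a file input can be passed directly.
async function getImageMimeType(file: Blob): Promise<string | null> {
  const buffer = await file.slice(0, HEADER_LENGTH).arrayBuffer();
  return matchSignature(new Uint8Array(buffer));
}
```

Slicing before calling arrayBuffer() is what keeps the cost independent of file size: only the sliced bytes are copied into memory, however large the upload is.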
Integrate With Form Component
Now we can use our getImageMimeType function to retrieve and display the file's MIME type instead of relying on the extension.
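The form component is not reproduced here, so the sketch below covers only the logic a change handler might call. UploadLike, UploadReport, and describeUpload are hypothetical names. The detector is passed in as a parameter so the snippet stands alone; in the form it would be getImageMimeType.

```typescript
// Minimal shape of what the handler needs; a browser File satisfies it.
interface UploadLike {
  name: string;
  type: string; // MIME type the browser derived from the file extension
}

// Hypothetical result object the component could keep in state and render.
interface UploadReport {
  name: string;
  declaredType: string;
  detectedType: string | null;
  mismatch: boolean;
}

// `detect` is injected to keep this sketch self-contained.
async function describeUpload<F extends UploadLike>(
  file: F,
  detect: (file: F) => Promise<string | null>,
): Promise<UploadReport> {
  const detectedType = await detect(file);
  return {
    name: file.name,
    declaredType: file.type,
    detectedType,
    // True when the bytes disagree with, or cannot confirm, the declared type.
    mismatch: detectedType !== file.type,
  };
}
```

In a React change handler, one option is to call describeUpload with the selected file, after checking that a file was actually chosen, and render detectedType rather than the file's own type property.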
As we can see, even though the extension of the file is .pdf, the MIME type is correctly shown as image/png.
Conclusion
Using magic numbers for MIME type validation is an effective first step in validating uploaded files. However, it should be treated as a starting point rather than the only check on file type. Client-side validation alone is not enough: proper checks must also be performed on the server, together with additional validation, so that unwanted or harmful files are not inadvertently stored or processed.
Thank you for reading this article. Catch you in the next one.