The Xilinx ISE Installer Lies


For various reasons, I wanted to run Xilinx ISE (the predecessor to Vivado) in a parallel and portable manner (think of running place and route on thousands of digital designs for FPGAs). ISE is no longer being developed, but Xilinx obviously keeps some of the recent versions hosted for their gray-haired customers. There is a Linux installer, but it’s always a pain to set up, and getting all the right dependencies and the environment set up correctly can be super annoying, so I wanted to make a Docker image so I can do it portably and reproducibly. Cool, sounds like a reasonable idea.
I grabbed the installer for the very last version they ever released, which is a 6.1 GB tar file. Docker has to copy this file into the image on every build, which takes more than 30 seconds and is annoying (I know about bind mounts at build time, but let’s say I want to include the file with the image). If I also want to host the installer alongside the Dockerfile on GitHub as a build asset, I need to split it into around 70 chunks of 90 MB each. All of these issues come from the sheer size of the installer, so that is the main problem to solve.
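To make the chunking concrete, here is a minimal sketch of splitting a big file into fixed-size parts and joining them back. The 90 MB figure is taken as 90,000,000 bytes, and the `.partNNN` naming scheme and function names are made up for illustration. With those assumptions, a 6.1 GB file works out to 68 parts.

```python
import math
from pathlib import Path

# Assumption: "90 MB" means 90 * 10^6 bytes.
CHUNK_SIZE = 90 * 1000 * 1000


def chunk_count(total_bytes, chunk_size=CHUNK_SIZE):
    """Number of parts needed to hold total_bytes."""
    return math.ceil(total_bytes / chunk_size)


def split_file(src, out_dir, chunk_size=CHUNK_SIZE):
    """Write src as numbered parts in out_dir and return their paths in order."""
    src = Path(src)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    parts = []
    with src.open("rb") as f:
        index = 0
        while True:
            block = f.read(chunk_size)  # at most one chunk in memory at a time
            if not block:
                break
            part = out_dir / f"{src.name}.part{index:03d}"  # hypothetical naming
            part.write_bytes(block)
            parts.append(part)
            index += 1
    return parts


def join_files(parts, dest):
    """Concatenate the parts, in the given order, back into one file."""
    with open(dest, "wb") as out:
        for part in parts:
            out.write(Path(part).read_bytes())
```

The zero-padded index keeps the parts in the right order under a plain lexical sort, so reassembly is just `join_files(sorted(parts), dest)`.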
So I first tried to compress the installer with gzip, but it barely reduced the file size. Then I looked inside the tar file, and almost all the large install files are .zip.xz files. Okay, weird. Zip already supports compression inside the archive format, so why would they wrap zip files, compressed or not, in xz? And why would they not just use zip with compression, or tar and gzip, for the installer as a whole?
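Looking inside the tar file doesn't need any extraction. A minimal sketch using Python's standard `tarfile` module, which only reads the member headers, could look like this (the function name is mine):

```python
import tarfile


def largest_members(tar_path, top=10):
    """Return (name, size_in_bytes) for the biggest regular files in a tar."""
    with tarfile.open(tar_path, "r") as tar:
        files = [m for m in tar.getmembers() if m.isfile()]
    # Biggest first, so the files that dominate the archive show up at the top.
    files.sort(key=lambda m: m.size, reverse=True)
    return [(m.name, m.size) for m in files[:top]]
```

Printing the result for the installer is enough to see which members account for most of the size and what extensions they carry.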
I then decided to recompress these large .zip.xz files with xz using a higher compression effort or a larger compression window. That is, take every xz file in the tar file, decompress it, recompress it with the highest compression effort, and put it back into the installer tar in place of the original. This way, I can maybe reduce the overall tar file size and solve some of my issues, assuming Xilinx did not already compress these files with the highest effort. You would think they would care a lot about this, since they have to serve this 6 GB installer to users, and the smaller the installer, the more money they save on bandwidth. My educated guess is that Xilinx doesn't care enough to do this.
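For a file that really is an xz stream, that plan is only a few lines with Python's standard `lzma` module. This is a rough sketch, not what the installer ended up needing: the function name is mine, it holds the whole file in memory, and `9 | lzma.PRESET_EXTREME` is the stdlib equivalent of `xz -9e`.

```python
import lzma


def recompress_xz(blob):
    """Decompress an xz stream and recompress it at the highest preset.

    Returns the original bytes unchanged if recompression does not help.
    """
    raw = lzma.decompress(blob)
    out = lzma.compress(
        raw,
        format=lzma.FORMAT_XZ,
        preset=9 | lzma.PRESET_EXTREME,  # slowest, highest effort
    )
    return out if len(out) < len(blob) else blob
```

The fallback matters because a file that was already compressed at high effort can come out slightly larger after a second pass.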
Well, it turns out the ISE installer lies. Despite the extension, these .zip.xz files are neither zip nor xz: they are 7zip archives, and password-protected ones at that. Naturally, I found out what the password is (I won’t say it here, but there are several public ways to find it). The new plan is the same recompression idea as before, except that I recompress these 7zip files with the highest compression effort I can (the delta filter plus the supported lzma2 compression filter at the highest compression level).
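A quick way to catch a lying file extension is to look at the first few bytes instead. The sketch below checks the well-known magic numbers for 7z, xz, zip, and gzip; the function names are mine and the table is deliberately tiny.

```python
# Leading magic bytes of a few common container/compression formats.
SIGNATURES = [
    (b"7z\xbc\xaf\x27\x1c", "7z"),
    (b"\xfd7zXZ\x00", "xz"),
    (b"PK\x03\x04", "zip"),
    (b"\x1f\x8b", "gzip"),
]


def sniff_format(header):
    """Guess the format from the first bytes of a file, ignoring its name."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"


def sniff_file(path):
    """Read just enough of the file to compare against the signatures."""
    with open(path, "rb") as f:
        return sniff_format(f.read(8))
```

Run against a member pulled out of the installer, a result of `"7z"` for something named `.zip.xz` is exactly the mismatch described above.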
I first tried to do this on disk on my laptop: un-tarring and un-7zipping the individual .zip.xz files into a temporary directory, re-7zipping them with the new settings, and then reassembling the tar file with the new .zip.xz files. I ran out of disk space (and the disk I/O adds overhead anyway). So then I decided to do it all in memory using Python, since there are nice libraries for tar and 7zip, and I can selectively read out parts of the tar file and process them in memory, possibly in parallel. I then crashed my laptop via OOM. I was able to print out some metrics while it ran, and I could see that some files shrank by close to 0% while others were closer to 6%. So it does work a little sometimes; I just haven’t run it all the way through to get the overall size reduction for the entire installer.
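One way to keep memory bounded is to walk the tar one member at a time and write the output tar as you go, so at most one archive's bytes are held at once. The sketch below is an assumption-laden outline, not the script I ran: `transform` is a hypothetical callable that takes the old member bytes and returns the recompressed bytes (the 7zip decrypt and recompress step would live there), and everything else uses only the standard `tarfile` module.

```python
import copy
import io
import tarfile


def rewrite_tar(src_path, dst_path, transform, suffix=".zip.xz"):
    """Copy a tar, passing members whose name ends in suffix through transform.

    Returns a list of (name, old_size, new_size) for the transformed members.
    """
    stats = []
    with tarfile.open(src_path, "r") as src, tarfile.open(dst_path, "w") as dst:
        for member in src:  # sequential walk, one member at a time
            if not member.isfile():
                dst.addfile(member)  # directories, links: header only
                continue
            fileobj = src.extractfile(member)
            if not member.name.endswith(suffix):
                dst.addfile(member, fileobj)  # copied through untouched
                continue
            old = fileobj.read()
            new = transform(old)  # hypothetical recompression step
            if len(new) >= len(old):
                new = old  # keep the original if nothing was gained
            info = copy.copy(member)  # same name, mode, mtime; new size
            info.size = len(new)
            dst.addfile(info, io.BytesIO(new))
            stats.append((member.name, len(old), len(new)))
    return stats
```

The per-file reduction is then `100 * (1 - new_size / old_size)` for each entry in the returned list, and summing the sizes gives the overall figure. A single very large member still has to fit in memory, so this bounds the peak rather than eliminating it.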
I'm still curious if I can make this work well. The recompression itself is also super slow, but the installer being 6 GB is annoying enough that it’s a worthy issue for me to pester myself over. So far I have been doing all this in Python with the py7zr library, but there is also a 7zip library in Rust that looks more promising. I think there is a way to use zstd compression too, but I’m not sure whether the official 7zip tool, or whatever 7zip implementation the Xilinx installer uses, supports zstd.
I guess the conclusion here is that Xilinx never ceases to "surprise" me with their software, not even with something as mundane as the installer.