import Gist from "react-gist";
import { Body, Subtitle } from "../../components/Typography";

export const GitlabDependencyCaching = () => {
  return (
    <>
      <Subtitle>The Problem</Subtitle>
      <Body>
        Anyone who has worked in the JavaScript world knows that the
        node_modules folder can get quite large, especially in a monorepo
        with many packages. Ours has over 100 packages, and its node_modules
        folder had grown to the point where our CI/CD pipelines spent a long
        time installing dependencies, which sent our build and deployment
        times skyrocketing.
      </Body>
      <Subtitle>Any Free Lunch?</Subtitle>
      <Body>
        From the GitLab documentation, we found that the recommended way to fix
        this problem (like most other CI problems) is to use the built-in
        caching. The idea is simple: point GitLab at your dependency folder,
        and GitLab will zip that folder up and populate future builds with it!
      </Body>
      <Gist id="f52b315ac28353d4ad291445214cfd88" />
      <Body>
        This is great... but it quickly falls on its face whenever you try to
        do something interesting. By interesting, I mean installing more than
        just 10 packages. The problem we quickly ran into is how GitLab stores
        its cache: as a zip file, and when you try to zip a LOT of files, zip
        starts to slow down. Zipping up the files is good in theory, but in
        practice, especially in 2024, it's almost always faster to just
        download a larger file than it is to download a smaller file and then
        unzip it.
      </Body>
      <Subtitle>The Solution</Subtitle>
      <Body>
        Well, I wish it were something more exciting, but our solution is
        just tar... and a little bit of bash. Let's tweak the above
        configuration to use tar instead of zip.
      </Body>
      <Gist id="d607326705e7a40521e2599bd2d4938d" />
      <Body>
        With this change, we saw the following improvements:
        <ul>
          <li>Pushing a new cache archive decreased from 1m to ~10s</li>
          <li>Restoring the cache decreased from 1m 30s to ~10s</li>
        </ul>
        Note that most of the gains from using tar here come not from the
        download time, but from the time it takes to move files into their
        appropriate locations.
      </Body>
    </>
  );
};
