Cloud Native Buildpack Inside Out

Jay Lee
11 min read · Mar 31, 2022

This is a follow-up to my previous article, “Creating Spring Boot container in a minute with Cloud Native Buildpacks for Azure Container Platform”. You probably don’t need to know the nitty-gritty details of Cloud Native Buildpacks (CNB) to use them, but they will be useful if you run into issues or want to extend a buildpack for your own requirements.

Let’s start with my question from the previous article: “…If you’re inquisitive enough, it makes you wonder how on earth buildpack knows if it’s a Java project specifically a maven project …” The first thing to remember is that there is a separate buildpack for each technology. For example, there are the ASP.NET Core Buildpack, .NET Core Runtime Buildpack, Executable JAR Buildpack, Go Build Buildpack, Go Distribution Buildpack, Nginx Server Buildpack, and so on. Each buildpack has its own logic to tell whether the source code is something it can handle. This process of scanning the source code is called the DETECT lifecycle, and there are four major lifecycles in Cloud Native Buildpacks: ANALYZE, DETECT, BUILD, and EXPORT. We will start with DETECT and then go with the flow.

Buildpack lifecycle

Go to the Maven buildpack’s GitHub repository and look at its DETECT logic (https://github.com/paketo-buildpacks/maven/blob/main/maven/detect.go). It checks for the existence of a pom.xml file and returns Pass: true or false accordingly. Let’s look at one more buildpack before we move on (https://github.com/paketo-buildpacks/nginx/blob/main/detect.go). The Nginx buildpack looks for an nginx.conf file to detect the project, then checks the value of the environment variable BP_NGINX_VERSION for further verification.
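
The pom.xml check is simple enough to sketch in a few lines. The real buildpacks are written in Go, so the Python below is only an illustration. The function name detect_maven and the shape of the returned dictionary are my own assumptions, not the buildpack’s API.

```python
import tempfile
from pathlib import Path


def detect_maven(app_dir: str) -> dict:
    """Illustrative stand-in for a detect step (hypothetical name).

    Passes only when a pom.xml sits at the root of the application directory.
    """
    has_pom = (Path(app_dir) / "pom.xml").is_file()
    return {"pass": has_pom}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as app_dir:
        # Empty directory: nothing for a Maven buildpack to work with.
        print(detect_maven(app_dir))
        # After adding a pom.xml the same directory is detected.
        (Path(app_dir) / "pom.xml").write_text("<project/>")
        print(detect_maven(app_dir))
```

The point is that detection is cheap. It looks at a file or an environment variable, not at the whole project.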

Now, run pack once more and see when DETECT kicks in. Each line starting with ===> in the logs marks the beginning of a lifecycle. The logs below show that pack runs ANALYZE, DETECT, BUILD, and EXPORT in sequence. Let’s ignore RESTORE for now.

$ pack build eggboy/springboot:0.0.1 --builder paketobuildpacks/builder:base --buildpack paketo-buildpacks/java-azure
base: Pulling from paketobuildpacks/builder
Digest: sha256:b32ab96a21dd7a013f924d644b4e555b71ae180cf30493286a3a1e3c16c93fa5
Status: Image is up to date for paketobuildpacks/builder:base
base-cnb: Pulling from paketobuildpacks/run
Digest: sha256:bcab6379bf83f0657dab49f08c7da7f23a6b18145352b2e13178e35cf6bd39c1
Status: Image is up to date for paketobuildpacks/run:base-cnb
gcr.io/paketo-buildpacks/java-azure@sha256:ecd2f82e642aa29556c3c6dea97251b807c3299f3e7abc768746bf4b594e2b03: Pulling from paketo-buildpacks/java-azure
Digest: sha256:ecd2f82e642aa29556c3c6dea97251b807c3299f3e7abc768746bf4b594e2b03
Status: Image is up to date for gcr.io/paketo-buildpacks/java-azure@sha256:ecd2f82e642aa29556c3c6dea97251b807c3299f3e7abc768746bf4b594e2b03
===> ANALYZING
...
===> DETECTING
8 of 20 buildpacks participating
paketo-buildpacks/ca-certificates 3.1.0
paketo-buildpacks/microsoft-openjdk 2.2.0
paketo-buildpacks/syft 1.10.0
paketo-buildpacks/maven 6.4.1
paketo-buildpacks/executable-jar 6.1.0
paketo-buildpacks/apache-tomcat 7.3.0
paketo-buildpacks/dist-zip 5.2.0
paketo-buildpacks/spring-boot 5.8.0

===> RESTORING
...
===> BUILDING
...
===> EXPORTING
...
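
If you want to pull the lifecycle sequence out of a longer log, the ===> markers are easy to match. This is a minimal sketch that assumes the log format shown above. The helper name phases_in_log is mine, and the sample is a shortened copy of that log.

```python
import re
from typing import List

# A marker line looks like "===> DETECTING" in the pack output above.
PHASE_MARKER = re.compile(r"^===> (\w+)$")


def phases_in_log(log_text: str) -> List[str]:
    """Return the lifecycle names in the order their markers appear."""
    phases = []
    for line in log_text.splitlines():
        match = PHASE_MARKER.match(line.strip())
        if match:
            phases.append(match.group(1))
    return phases


SAMPLE_LOG = """\
===> ANALYZING
...
===> DETECTING
8 of 20 buildpacks participating
===> RESTORING
...
===> BUILDING
...
===> EXPORTING
...
"""

if __name__ == "__main__":
    print(phases_in_log(SAMPLE_LOG))
```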

The output says 8 buildpacks are participating; in other words, 8 buildpacks such as microsoft-openjdk and maven passed DETECT for my Maven project. By the way, where do those buildpacks come from? How does pack know which buildpacks are available? The answer lies in the command itself: --builder. You probably didn’t pay much attention to this flag until now. If you’re observant, you will have noticed that pack always pulls the image paketobuildpacks/builder at the beginning of the run. What does the builder have to do with DETECT, then? From the CNB documentation:

“A builder is an image that contains all the components necessary to execute a build. A builder image is created by taking a build image and adding a lifecycle, buildpacks, and files that configure aspects of the build including the buildpack detection order and the location(s) of the run image”

In short, the builder image runs each lifecycle with the buildpacks to produce the app image. We can use dive to check what’s inside the builder image. (I hid unmodified files to make it easier to see the difference between layers.)

Layers of paketo builder

dive reveals two interesting things about the builder image: 1. the builder indeed has loads of buildpacks inside, and 2. every buildpack has at least two binaries in common, build and detect. This answers the question of where the buildpacks come from. You could guess at this point that the builder iterates over all the buildpacks it contains and runs detect against your source code. Then you might be curious how many buildpacks the Paketo builder contains. You don’t have to count them yourself: there are about 86 in total. Would it really run detect 86 times? Of course not. The builder has a so-called detection order, which arranges related buildpacks into groups so that it doesn’t have to blindly run them one by one. pack builder inspect shows this in detail. In the order, a buildpack marked (optional) may fail detection without failing its group.

$ pack builder inspect paketobuildpacks/builder:base
...
Detection Order:
├ Group #1:
│ ├ paketo-buildpacks/ruby@0.11.0
│ │ └ Group #1:
│ │ ├ paketo-buildpacks/ca-certificates@3.0.2 (optional)
│ │ ├ paketo-buildpacks/mri@0.6.0
│ │ ├ paketo-buildpacks/bundler@0.3.1
│ │ ├ paketo-buildpacks/bundle-install@0.3.1
│ │ ├ paketo-buildpacks/node-engine@0.11.4 (optional)
...
├ Group #2:
│ ├ paketo-buildpacks/dotnet-core@0.14.2
│ │ └ Group #1:
│ │ ├ paketo-buildpacks/ca-certificates@3.1.0 (optional)
│ │ ├ paketo-buildpacks/watchexec@2.3.3 (optional)
│ │ ├ paketo-buildpacks/dotnet-core-runtime@0.5.5
│ │ ├ paketo-buildpacks/dotnet-core-aspnet@0.5.4 (optional)
│ │ ├ paketo-buildpacks/dotnet-core-sdk@0.5.6
...
├ Group #3:
│ ├ paketo-buildpacks/go@1.1.0
│ │ └ Group #1:
│ │ ├ paketo-buildpacks/ca-certificates@3.1.0 (optional)
│ │ ├ paketo-buildpacks/watchexec@2.3.3 (optional)
│ │ ├ paketo-buildpacks/go-dist@1.1.0
│ │ ├ paketo-buildpacks/git@0.4.1 (optional)
│ │ ├ paketo-buildpacks/go-mod-vendor@0.5.1
│ │ ├ paketo-buildpacks/go-build@1.0.2
...
├ Group #7:
│ └ paketo-buildpacks/java@6.14.1
│ └ Group #1:
│ ├ paketo-buildpacks/ca-certificates@3.1.0 (optional)
│ ├ paketo-buildpacks/bellsoft-liberica@9.2.0
│ ├ paketo-buildpacks/syft@1.10.0 (optional)
│ ├ paketo-buildpacks/leiningen@4.3.0 (optional)
│ ├ paketo-buildpacks/clojure-tools@2.3.0 (optional)
│ ├ paketo-buildpacks/gradle@6.4.1 (optional)
│ ├ paketo-buildpacks/maven@6.4.1 (optional)
│ ├ paketo-buildpacks/sbt@6.4.0 (optional)
│ ├ paketo-buildpacks/watchexec@2.3.3 (optional)
│ ├ paketo-buildpacks/executable-jar@6.1.0 (optional)
│ ├ paketo-buildpacks/apache-tomcat@7.2.0 (optional)
│ ├ paketo-buildpacks/dist-zip@5.2.0 (optional)
│ ├ paketo-buildpacks/spring-boot@5.8.0 (optional)
│ ├ paketo-buildpacks/procfile@5.1.0 (optional)
│ ├ paketo-buildpacks/jattach@1.0.0 (optional)
│ ├ paketo-buildpacks/azure-application-insights@5.3.2 (optional)
│ ├ paketo-buildpacks/google-stackdriver@5.6.1 (optional)
│ ├ paketo-buildpacks/java-memory-assistant@1.0.0 (optional)
│ ├ paketo-buildpacks/encrypt-at-rest@4.1.0 (optional)
│ ├ paketo-buildpacks/environment-variables@4.1.0 (optional)
│ └ paketo-buildpacks/image-labels@4.1.0 (optional)
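
Here is how I picture group selection, as a simplified sketch and not the lifecycle’s actual code. It ignores nested groups and build plan matching. It treats a group as passing when every non-optional buildpack passes detection. The buildpack ids echo the inspect output above, but the groups are shortened and the detect results are made up.

```python
from typing import Dict, List, Optional, Tuple

# Each group is a list of (buildpack id, is_optional) pairs.
Group = List[Tuple[str, bool]]


def select_group(order: List[Group], passed: Dict[str, bool]) -> Optional[List[str]]:
    """Pick the first group whose required buildpacks all pass detection.

    Optional buildpacks that fail are simply left out of the result.
    """
    for group in order:
        required_ok = all(
            passed.get(bp, False) for bp, optional in group if not optional
        )
        if required_ok:
            return [bp for bp, _ in group if passed.get(bp, False)]
    return None


ORDER: List[Group] = [
    [("paketo-buildpacks/go-dist", False), ("paketo-buildpacks/go-build", False)],
    [
        ("paketo-buildpacks/ca-certificates", True),
        ("paketo-buildpacks/bellsoft-liberica", False),
        ("paketo-buildpacks/gradle", True),
        ("paketo-buildpacks/maven", True),
    ],
]

# Hypothetical detect results for a Maven project: Go and Gradle do not match.
PASSED = {
    "paketo-buildpacks/ca-certificates": True,
    "paketo-buildpacks/bellsoft-liberica": True,
    "paketo-buildpacks/maven": True,
}

if __name__ == "__main__":
    print(select_group(ORDER, PASSED))
```

With these inputs the Go group is rejected because its required buildpacks fail. The Java group is selected without gradle, which is optional and did not pass.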

Finally, we can answer the question, “how on earth buildpack knows if it’s a Java project specifically a maven project”. Great. Before we close out this chapter, let’s go back to DETECT and quickly check one more thing in https://github.com/paketo-buildpacks/maven/blob/main/maven/detect.go. Pay attention to the part where detect returns a build plan with two parts, requires and provides. Interestingly, the duty of DETECT doesn’t end at choosing the right buildpacks. Each participating buildpack also returns a build plan; Maven’s detect, for example, returns a requires entry for PlanEntryJDK. This makes sense, as Maven requires a JDK to run. The plans from all participating buildpacks become the building blocks of the app image. Now it’s the right time to move on to BUILD.
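
To make requires and provides a bit more concrete, here is a small sketch of how the plans fit together. The idea is that something one buildpack requires has to be provided by some buildpack in the group. The plan entry names are illustrative assumptions, not copied from the Paketo sources. The real resolution rules in the spec are stricter than this set difference.

```python
from typing import Dict, List, Set

# What each participating buildpack contributes to the plan.
# Entry names are hypothetical and only meant to show the shape.
PLANS: Dict[str, Dict[str, List[str]]] = {
    "microsoft-openjdk": {"provides": ["jdk", "jre"], "requires": []},
    "maven": {"provides": ["jvm-application-package"], "requires": ["jdk"]},
    "executable-jar": {
        "provides": ["jvm-application"],
        "requires": ["jre", "jvm-application-package"],
    },
}


def unmet_requirements(plans: Dict[str, Dict[str, List[str]]]) -> Set[str]:
    """Return every required entry that no buildpack in the group provides."""
    provided = {name for plan in plans.values() for name in plan["provides"]}
    required = {name for plan in plans.values() for name in plan["requires"]}
    return required - provided


if __name__ == "__main__":
    # With a JDK buildpack in the group, Maven's requirement is satisfied.
    print(unmet_requirements(PLANS))
    # Without it, the JDK and JRE requirements are left dangling.
    without_jdk = {k: v for k, v in PLANS.items() if k != "microsoft-openjdk"}
    print(unmet_requirements(without_jdk))
```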

Basics of Building Image

In layman’s terms, BUILD transforms application source code into runnable artifacts that can be packaged into a container. As the old saying goes, “A picture is worth a thousand words”, and I regret not having created a diagram for DETECT like the one below from the VMware Tanzu team. It nicely depicts how runnable artifacts are built for the container image.

https://tanzu.vmware.com/developer/guides/cnb-what-is/

The “build image” comes from the builder, so that part is straightforward, but what is the “run image”? Having two different images for build time and run time makes sense, as the builder image contains build-related tooling that would only bloat the image at run time. Each builder has its own choice of run image, which you can see in the output of pack builder inspect.

$ pack builder inspect paketobuildpacks/builder:base
...
Run Images:
index.docker.io/paketobuildpacks/run:base-cnb
gcr.io/paketo-buildpacks/run:base-cnb
...

There is one thing missing in this picture: the cache. If you have run pack build multiple times with the same Spring Boot project, you will have noticed that pack doesn’t download the Maven dependencies again after the initial build. That is because the buildpack caches .m2 and reuses it in subsequent runs. A cache can help pretty much any language with a dependency manager such as npm, yarn, Cargo, NuGet, or Maven. By default, pack stores the cache in local volumes.

$ docker volume ls
DRIVER VOLUME NAME
local 6a0f58cec851d7466935cd74966522c805fe76a2e4d7534d506a7aa76a7df2b9
local pack-cache-eggboy_demo_0.0.1-ad4549aaaa60.build
local pack-cache-eggboy_demo_0.0.1-ad4549aaaa60.launch
local pack-cache-eggboy_springboot_0.0.1-4afb4c557b06.build
local pack-cache-eggboy_springboot_0.0.1-4afb4c557b06.launch
local pack-cache-eggboy_workloadidentity-blob_0.0.1-bfc2e00b618a.build
local pack-cache-eggboy_workloadidentity-blob_0.0.1-bfc2e00b618a.launch

If you supply --cache-image to pack build, it creates a cache container image instead of a volume. This is extremely useful in CI/CD pipelines where runners or agents have no persistent local volumes to cache in. A cache image has to be used together with --publish, which pushes the cache image to the container registry once EXPORT is completed.

$ pack build eggboy/springboot:0.0.1 --builder paketobuildpacks/builder:base --buildpack paketo-buildpacks/java-azure --cache-image eggboy/build-cache --publish
...
$ docker pull eggboy/build-cache
$ docker images | grep cache
eggboy/build-cache latest a5eac219cc27 42 years ago 491MB

The diagram above from the Tanzu team shows that the app image is built from multiple layers, which is a common practice when building container images because it lets the build reuse cached layers.

ANALYZE and RESTORE

If you remember, we skipped ANALYZE and RESTORE earlier, and I feel this is the right time to take a closer look at them. There is a nice picture in the buildpack spec that describes their roles.

https://github.com/buildpacks/spec/blob/main/buildpack.md#phase-2-analysis

In short, ANALYZE and RESTORE work hand in hand to decide which layers can be reused and which should be replaced while building an app image. Let’s go back to the logs from pack. ANALYZE restores the SBOM (software bill of materials) data from the previous image. RESTORE then restores layer metadata from the app image and the cache. In the logs, “cache” means the local volume, and “app image” is the previously built image.

$ pack build eggboy/springboot:0.0.1 --builder paketobuildpacks/builder:base --buildpack paketo-buildpacks/java-azure
...
===> ANALYZING
Restoring data for sbom from previous image
===> DETECTING
...
===> RESTORING
Restoring metadata for "paketo-buildpacks/ca-certificates:helper" from app image
Restoring metadata for "paketo-buildpacks/microsoft-openjdk:helper" from app image
Restoring metadata for "paketo-buildpacks/microsoft-openjdk:java-security-properties" from app image
Restoring metadata for "paketo-buildpacks/microsoft-openjdk:jdk" from app image
Restoring metadata for "paketo-buildpacks/syft:syft" from cache
Restoring metadata for "paketo-buildpacks/maven:application" from cache
Restoring metadata for "paketo-buildpacks/maven:cache" from cache
Restoring metadata for "paketo-buildpacks/spring-boot:web-application-type" from app image
Restoring metadata for "paketo-buildpacks/spring-boot:helper" from app image
Restoring metadata for "paketo-buildpacks/spring-boot:spring-cloud-bindings" from app image
Restoring data for "paketo-buildpacks/microsoft-openjdk:jdk" from cache
Restoring data for "paketo-buildpacks/syft:syft" from cache
Restoring data for "paketo-buildpacks/maven:application" from cache
Restoring data for "paketo-buildpacks/maven:cache" from cache
Restoring data for sbom from cache

===> BUILDING
...
===> EXPORTING
...
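
To see at a glance which layers come from the app image and which from the cache, you can group the RESTORING lines. This is a sketch that assumes the exact wording of the log lines above. The function name summarize_restore is mine, and the sample is a handful of lines copied from that log.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple

# Matches lines such as:
#   Restoring metadata for "paketo-buildpacks/maven:cache" from cache
RESTORE_LINE = re.compile(
    r'^Restoring (metadata|data) for "([^"]+)" from (app image|cache)$'
)


def summarize_restore(log_text: str) -> Dict[Tuple[str, str], List[str]]:
    """Group restored layer names by (kind, source)."""
    summary = defaultdict(list)
    for line in log_text.splitlines():
        match = RESTORE_LINE.match(line.strip())
        if match:
            kind, layer, source = match.groups()
            summary[(kind, source)].append(layer)
    return dict(summary)


SAMPLE = '''
Restoring metadata for "paketo-buildpacks/microsoft-openjdk:jdk" from app image
Restoring metadata for "paketo-buildpacks/maven:cache" from cache
Restoring data for "paketo-buildpacks/microsoft-openjdk:jdk" from cache
Restoring data for "paketo-buildpacks/maven:cache" from cache
Restoring data for sbom from cache
'''

if __name__ == "__main__":
    # The unquoted "sbom" line does not match the pattern and is skipped.
    for key, layers in summarize_restore(SAMPLE).items():
        print(key, layers)
```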

Where does this SBOM or metadata live in the app image? The natural place for metadata is the image labels.

$ docker inspect eggboy/springboot:0.0.1
...
"io.buildpacks.build.metadata": "{\"bom\":[{\"name\":\"helper\",\"metadata\":{\"layer\":\"helper\",\"names\":[\"ca-certificates-helper\"],\"version\":\"3.1.0\"},\"buildpack\":{\"id\":\"paketo-buildpacks/ca-certificates\",\"version\":\"3.1.0\"}}
...
"io.buildpacks.lifecycle.metadata": "{\"app\":[{\"sha\":\"sha256:6a2a5287bdff8d4a260c591f15b215a755cbf0fcadebb974ea825d393322055d\"},{\"sha\":\"sha256:c4537747ad1efb1f098a319cad22cb13212d9b2c325c48ef9154078d4eeb8868\"},{\"sha\":\"sha256:5f70bf18a086007016e948b04aed3b82103a36bea41755b6cddfaf10ace3c6ef\"},{\"sha\":\"sha256:ff05d22892223e8145a7b3b8f4cd06ca8d5abd3562a1efa8f7a7c2ea2f48b27b\"},{\"sha\":\"sha256:5f70bf18a086007016e948b04aed3b82103a36bea41755b6cddfaf10ace3c6ef\"}],\"sbom\":{\"sha\":\"sha256:b9f29a2487d068cbe50c7fc56672d1f680efac5d81a5864203ae47b51363bf02\"},\"buildpacks\":[{\"key\":\"paketo-buildpacks/ca-certificates\",\"version\":\"3.1.0\",\"layers\":{\"helper\":{\"sha\":\"sha256:7a5c552506411ee7c48f73f8a72fe6e6eebba1aece6b9f41368009311bf93ebb\"
...
"io.buildpacks.stack.mixins": "[\"adduser\",\"apt\",\"base-files\",\"base-passwd\",\"bash\",\"bsdutils\",\"bzip2\",\"ca-certificates\",\"coreutils\",\"dash\",\"debconf\",\"debianutils\",\"diffutils\",\"dpkg\",
...

io.buildpacks.lifecycle.metadata is the label that drives ANALYZE and RESTORE. A buildpack layer can be flagged with three layer types: launch, build, and cache. Depending on the combination of layer types, RESTORE reads from the app image metadata or from the cache. The buildpack spec offers a nice summary of the combinations and the expected behavior: https://github.com/buildpacks/spec/blob/main/buildpack.md#layer-types
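
Since the label value is itself a JSON string, it is easy to pick apart. This is a minimal sketch that assumes only the keys visible in the docker inspect output above (buildpacks, key, layers, sha). The sample is shortened to one buildpack, and the helper name layer_digests is mine.

```python
import json
from typing import Dict

# A shortened label value shaped like the docker inspect output above.
# The digest is copied from that output; nothing beyond the keys shown
# there is assumed.
LABEL_VALUE = json.dumps(
    {
        "buildpacks": [
            {
                "key": "paketo-buildpacks/ca-certificates",
                "version": "3.1.0",
                "layers": {
                    "helper": {
                        "sha": "sha256:7a5c552506411ee7c48f73f8a72fe6e6eebba1aece6b9f41368009311bf93ebb"
                    }
                },
            }
        ]
    }
)


def layer_digests(label_value: str) -> Dict[str, str]:
    """Map "buildpack:layer" names to the layer digest stored in the label."""
    metadata = json.loads(label_value)
    digests = {}
    for buildpack in metadata.get("buildpacks", []):
        for layer_name, layer in buildpack.get("layers", {}).items():
            digests[buildpack["key"] + ":" + layer_name] = layer.get("sha")
    return digests


if __name__ == "__main__":
    for name, sha in layer_digests(LABEL_VALUE).items():
        print(name, sha)
```

The names it prints follow the same buildpack:layer pattern as the RESTORING log lines, so you can match the two up.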

EXPORT

Let’s quickly recap how far we have come. We saw how DETECT works with the buildpack groups in the builder, how ANALYZE and RESTORE read metadata to tell BUILD which layers can be reused, and how BUILD creates the layers of runnable artifacts to be placed into the app image.

EXPORT is the final step. It puts the layers produced by the buildpacks during BUILD on top of the run image, updates the metadata in the labels to reflect the latest changes, and finally adds the relevant artifacts, such as the Maven repository and the application, to the cache.

===> EXPORTING
Adding layer 'paketo-buildpacks/ca-certificates:helper'
Adding layer 'paketo-buildpacks/microsoft-openjdk:helper'
Adding layer 'paketo-buildpacks/microsoft-openjdk:java-security-properties'
Adding layer 'paketo-buildpacks/microsoft-openjdk:jdk'
Adding layer 'paketo-buildpacks/executable-jar:classpath'
Adding layer 'paketo-buildpacks/spring-boot:helper'
Adding layer 'paketo-buildpacks/spring-boot:spring-cloud-bindings'
Adding layer 'paketo-buildpacks/spring-boot:web-application-type'
Adding layer 'launch.sbom'
Adding 5/5 app layer(s)
Adding layer 'launcher'
Adding layer 'config'
Adding layer 'process-types'
Adding label 'io.buildpacks.lifecycle.metadata'
Adding label 'io.buildpacks.build.metadata'
Adding label 'io.buildpacks.project.metadata'
Adding label 'org.opencontainers.image.title'
Adding label 'org.opencontainers.image.version'
Adding label 'org.springframework.boot.version'
Setting default process type 'web'
Saving eggboy/springboot:0.0.1...
*** Images (e922bf6caf97):
eggboy/springboot:0.0.1
Adding cache layer 'paketo-buildpacks/microsoft-openjdk:jdk'
Adding cache layer 'paketo-buildpacks/syft:syft'
Adding cache layer 'paketo-buildpacks/maven:application'
Adding cache layer 'paketo-buildpacks/maven:cache'
Adding cache layer 'cache.sbom'
Successfully built image 'eggboy/springboot:0.0.1'

Let’s look at the app image, and dive inside to see the different layers built by buildpacks.

$ docker images | grep spring
eggboy/springboot 0.0.1 e922bf6caf97 42 years ago 448MB
$ dive eggboy/springboot:0.0.1

But wait. Do you see something strange? The image was created “42 years ago”. In short, pack stamps images with a fixed creation time in 1980 so that builds are reproducible. There is a really good article that explains why: https://medium.com/buildpacks/time-travel-with-pack-e0efd8bf05db

If you dive into the new app image again, you should be able to relate the participating buildpacks to the actual layers in the image.

Layer by paketo-buildpacks/ca-certificates:helper

REBASE

The modular approach of buildpacks enables an intriguing use case called REBASE, one of the CNB lifecycles that was not mentioned earlier. The image below explains it nicely without a single word.

https://buildpacks.io/docs/concepts/operations/rebase/

I’m going to take an old image built by CNB that I happen to have on my Docker Hub and rebase it. To compare the results, I will use trivy from Aqua Security, an open-source tool I use almost every day.

$ docker pull eggboy/apidemo:0.0.1
$ trivy image eggboy/apidemo:0.0.1
2022-03-31T14:54:00.322+0800 INFO Detected OS: ubuntu
2022-03-31T14:54:00.322+0800 INFO Detecting Ubuntu vulnerabilities...
2022-03-31T14:54:00.325+0800 INFO Number of language-specific files: 6
2022-03-31T14:54:00.325+0800 INFO Detecting gobinary vulnerabilities...
2022-03-31T14:54:00.325+0800 INFO Detecting jar vulnerabilities...
eggboy/apidemo:0.0.1 (ubuntu 18.04)
===================================
Total: 72 (UNKNOWN: 0, LOW: 62, MEDIUM: 8, HIGH: 2, CRITICAL: 0)
...

Trivy reports a total of 72 issues in my old image. Now, I will rebase it onto the latest run image.

$ pack rebase eggboy/apidemo:0.0.1
0.0.1: Pulling from eggboy/apidemo
Digest: sha256:04a9d0e5d3216877a6f2244650386a00ada7b387ba7b347ec60519f7c8f4edda
Status: Image is up to date for eggboy/apidemo:0.0.1
base-cnb: Pulling from paketobuildpacks/run
Digest: sha256:bcab6379bf83f0657dab49f08c7da7f23a6b18145352b2e13178e35cf6bd39c1
Status: Image is up to date for paketobuildpacks/run:base-cnb
Rebasing eggboy/apidemo:0.0.1 on run image index.docker.io/paketobuildpacks/run:base-cnb
Saving eggboy/apidemo:0.0.1...
*** Images (6db01a2b7103):
eggboy/apidemo:0.0.1
Rebased Image: 6db01a2b71037f939d2990a16a750d54f185cc11fda3c484f3f69246e398dab1
Successfully rebased image eggboy/apidemo:0.0.1
$ trivy image eggboy/apidemo:0.0.1
2022-03-31T14:59:00.770+0800 INFO Detected OS: ubuntu
2022-03-31T14:59:00.771+0800 INFO Detecting Ubuntu vulnerabilities...
2022-03-31T14:59:00.780+0800 INFO Number of language-specific files: 6
2022-03-31T14:59:00.781+0800 INFO Detecting gobinary vulnerabilities...
2022-03-31T14:59:00.782+0800 INFO Detecting jar vulnerabilities...
eggboy/apidemo:0.0.1 (ubuntu 18.04)
===================================
Total: 37 (UNKNOWN: 0, LOW: 32, MEDIUM: 5, HIGH: 0, CRITICAL: 0)
...

Rebasing the image onto the latest run image reduced the total number of vulnerabilities from 72 to 37 without touching the application code. If you’re not happy with 37 and want to bring it down to 0, you can craft your own run image and then rebase your app image onto it. You can supply your own run image with --run-image.

$ pack rebase eggboy/apidemo:0.0.1 --run-image [your run image]
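
Going back to the two trivy summaries, it is handy to compare them per severity, for example in a pipeline that checks whether a rebase actually helped. This is a small sketch that parses the Total lines exactly as they appear in the output above. The function names are mine and are not part of trivy.

```python
import re
from typing import Dict, Tuple

TOTAL_LINE = re.compile(r"Total: (\d+) \(([^)]*)\)")


def parse_total(line: str) -> Tuple[int, Dict[str, int]]:
    """Split a trivy 'Total:' summary line into the total and per-severity counts."""
    match = TOTAL_LINE.search(line)
    if match is None:
        raise ValueError("not a Total line: " + line)
    by_severity = {}
    for part in match.group(2).split(","):
        name, count = part.split(":")
        by_severity[name.strip()] = int(count)
    return int(match.group(1)), by_severity


def severity_delta(before: str, after: str) -> Dict[str, int]:
    """Return after minus before for each severity (negative means fewer findings)."""
    _, before_counts = parse_total(before)
    _, after_counts = parse_total(after)
    return {sev: after_counts[sev] - before_counts[sev] for sev in before_counts}


# The two summary lines from the scans before and after the rebase.
BEFORE = "Total: 72 (UNKNOWN: 0, LOW: 62, MEDIUM: 8, HIGH: 2, CRITICAL: 0)"
AFTER = "Total: 37 (UNKNOWN: 0, LOW: 32, MEDIUM: 5, HIGH: 0, CRITICAL: 0)"

if __name__ == "__main__":
    print(severity_delta(BEFORE, AFTER))
```

For my image, both HIGH findings are gone after the rebase, and most of the remaining reduction comes from LOW.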

Wrapping Up

We have touched on all the important lifecycles of Cloud Native Buildpacks and how buildpacks are put together under the hood. I hope you found it useful and well worth your time. If you want to know more about the design specification of Cloud Native Buildpacks, there is a GitHub repo: https://github.com/buildpacks/spec/blob/main/buildpack.md

If you like my article, please leave some claps here or maybe even start following me. Thanks!
