Below are the notes collected from the SIGGRAPH 2023 Birds of a Feather (BoF) event on pipelines. The BoF space was organized into topic groups, and attendees could move between topics and discuss as they wished. As the organizer, I asked folks in each group to try to keep notes as much as possible. Below is the result of that process.
These notes ask as many questions as they answer, follow different styles, and are often terse. However, they also provide interesting insight into the topics covered and into what each group was discussing.
Notes by Topic
Observable Pipelines
Event bus pipelines
- E.g., render complete event
- Downstream tasks subscribed to events.
- What happens when the system goes down?
Tool use metrics?
- Problems at scale?
- How granularly to track?
Artist comprehension of
“Unit tests” for assets
Node-based event graph for tasks “Temerity” (Jim Callahan)
How to show/hide parts of the whole that the artists should(n’t) see?
Event bus
- Can “rerun” events to redo work (see the sketch after this list)
- E.g., LUTs changed -> regenerate all reviewables
Vs. pre-calc tasks
- No spooky action at a distance
- How to re-run a part of a graph?
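To make the replay idea concrete, here is a minimal sketch (the class, topic names, and journal format are all illustrative, not from any system discussed) of an event bus that journals events so downstream work can be re-run later:

```python
import json
from collections import defaultdict

class EventBus:
    """Toy event bus: journals every event so a topic can be replayed later."""

    def __init__(self, journal_path="events.log"):
        self.subscribers = defaultdict(list)  # topic -> list of callbacks
        self.journal_path = journal_path

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, payload):
        # Journal first so the event survives a crash, then fan out.
        with open(self.journal_path, "a") as f:
            f.write(json.dumps({"topic": topic, "payload": payload}) + "\n")
        for callback in self.subscribers[topic]:
            callback(payload)

    def replay(self, topic):
        # Re-run journaled events for one topic, e.g. after LUTs change,
        # to regenerate every reviewable.
        with open(self.journal_path) as f:
            for line in f:
                event = json.loads(line)
                if event["topic"] == topic:
                    for callback in self.subscribers[topic]:
                        callback(event["payload"])

bus = EventBus()
bus.subscribe("render_complete", lambda p: print("make reviewable:", p["shot"]))
bus.publish("render_complete", {"shot": "sq010_sh020"})
bus.replay("render_complete")  # redo the downstream work
```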
File Logs vs Logging service?
Issue reporting tool - report error w/context
- Local deque of user action events for “what were you doing?”
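A sketch of that deque idea, assuming a Python tool (the function names are hypothetical): keep a bounded collections.deque of recent user actions and attach it to any error report.

```python
import time
import traceback
from collections import deque

# Ring buffer of the last 50 user actions; old entries fall off automatically.
RECENT_ACTIONS = deque(maxlen=50)

def record_action(action: str) -> None:
    """Call this from UI handlers, e.g. record_action('opened shot sq010')."""
    RECENT_ACTIONS.append((time.time(), action))

def build_error_report(exc: Exception) -> dict:
    """Bundle the exception with the 'what were you doing?' context."""
    return {
        "error": repr(exc),
        "traceback": traceback.format_exc(),
        "recent_actions": list(RECENT_ACTIONS),
    }
```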
OpenTelemetry?
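For anyone who hasn't tried it, a minimal OpenTelemetry tracing sketch (the operation and attribute names are made up; `do_publish` is a stand-in). It also shows tagging the active span when an exception happens, which is one answer to the session-tagging question below. With no SDK configured, the API calls are no-ops, so this is safe to drop into a tool.

```python
from opentelemetry import trace

tracer = trace.get_tracer("pipeline.publish")

def do_publish(asset_id: str) -> None:
    pass  # stand-in for the real publish step

def publish_asset(asset_id: str) -> None:
    # One span per user-facing operation; attributes make sessions searchable.
    with tracer.start_as_current_span("publish_asset") as span:
        span.set_attribute("asset.id", asset_id)
        try:
            do_publish(asset_id)
        except Exception as exc:
            span.record_exception(exc)  # tag the trace when exceptions happen
            raise
```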
Poll: What do you like to use for tracing / metrics / log viewing?
Datadog: 1
Grafana: 4
Jaeger: 1
ElasticSearch: 1
Can we track a session of events and tag it when exceptions happen?
- Tags are good
- Track all the events?
- How do you avoid the serial overhead of tracking everything?
- The production-management side of observability
- How do you indicate that a step is complete from an artist and pipeline perspective?
- Notification of upstream / downstream changes
- Web-based tools and latency compared to Qt
Multi-Site Data Pipelines
Notes by Tom Cowland and Jason Scott
Topics
- External data management
- Multi-vendor with multiple pipelines
- Validation
- Ingestion
- Manage the trust boundary
- Studios with multiple brands and multiple locations
- Can “universal” pipelines work?
- More likely to succeed with multiple vendors if there are contracts at the boundary.
- “Skynet” (singular pipeline from a studio to all other vendors) precludes local innovation and iteration.
- Use USD with URIs (vs. paths) across multiple pipelines (see the sketch after this list)
- Desired direction
- DDoS can be managed
- Sidecar/pre-cache farm files can facilitate sharing
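As a sketch of the URI idea (everything here is hypothetical: the `asset://` scheme, the mount table, and a real implementation would plug into USD's asset-resolution layer rather than a bare function):

```python
from urllib.parse import urlparse

# Per-site mount table: the same URI resolves differently at each site.
SITE_MOUNTS = {
    "london": "/mnt/ldn/projects",
    "montreal": "/mnt/mtl/projects",
}

def resolve(uri: str, site: str) -> str:
    """Map e.g. 'asset://showA/char/bob/model/v003.usd' to a site-local path."""
    parsed = urlparse(uri)
    if parsed.scheme != "asset":
        raise ValueError(f"unsupported scheme: {parsed.scheme}")
    return f"{SITE_MOUNTS[site]}/{parsed.netloc}{parsed.path}"

print(resolve("asset://showA/char/bob/model/v003.usd", "london"))
# -> /mnt/ldn/projects/showA/char/bob/model/v003.usd
```

Because no absolute paths are baked into the files, each pipeline can map the same references onto its own storage.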
Challenges
- Data syncing is hard
- Knowing the full closure of everything that needs to be synced (see the sketch after this list)
- Understanding the impact of transfer latency on the production process
- Database syncing / replication vs. a single database
- Many use a single 3rd-party database (ShotGrid, ftrack), and latency for these doesn’t seem to be one of the larger issues studios are dealing with
- Multi-site databases require conflict management (versions, etc.)
- Visibility of location and disk space for transfer costs
- Filesystem replication (many people are interested in the different solutions out there, but no one in the BoF group is using one in production yet)
- Relying only on on-demand NFS reads can stall production too much
- Still need to pre-warm some transfers
- A multi-vendor shared file system is just as hard as a “universal” pipeline
- Control is important when considering commercial offerings for sync (vs. cost overheads)
- Latency of file streaming
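The “full closure” problem above is at heart a graph traversal; a minimal sketch with a made-up dependency table:

```python
def transfer_closure(root: str, dependencies: dict[str, list[str]]) -> set[str]:
    """Walk the dependency graph to find everything a sync must include."""
    needed, stack = set(), [root]
    while stack:
        item = stack.pop()
        if item not in needed:
            needed.add(item)
            stack.extend(dependencies.get(item, []))
    return needed

deps = {
    "shot.usd": ["char_bob.usd", "set_kitchen.usd"],
    "char_bob.usd": ["bob_textures.tx"],
}
print(transfer_closure("shot.usd", deps))
# -> {'shot.usd', 'char_bob.usd', 'set_kitchen.usd', 'bob_textures.tx'}
```

The hard part in production isn't the traversal; it's getting an accurate dependency table out of every file format in the first place.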
Questions to consider
- Thick vs. thin clients
- Single global-centralized data center (even cloud) with thin clients doesn’t work in all locations due to latency
- Many studios use on-site thin clients for consistency of workflow (with those not on-site) and flexibility
- For some studios, do individuals act as “sites” in the same way that full data centers do?
- Cloud workstations vs. data centers
- Types of transfer methods
- rsync (with a tarball wrapper)
- ZFS
- rclone
- Transfer priority systems
- Most use the existing farm queue manager as a transfer priority manager
- Visibility for production
- Don’t forget non-technical solutions (workflow design or social engineering that might help manage expectations or usages of data)
- Give outsource companies a metadata path specification (a JSON file mapping path fields to their intended usage)
- Many companies supply a document explaining their path specification, but a machine-readable JSON representation could be helpful
- To avoid version-number conflicts, use non-numerical versions so each site can generate IDs locally (see the sketch below)
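A sketch of that last point (the ID format is illustrative): if a version ID embeds a timestamp, the site, and a random suffix, any site can mint one with no central counter and no cross-site clashes.

```python
import time
import uuid

def new_version_id(site: str) -> str:
    """Mint a version ID locally; the timestamp prefix keeps IDs roughly sortable."""
    stamp = time.strftime("%Y%m%dT%H%M%S")
    return f"{stamp}-{site}-{uuid.uuid4().hex[:8]}"

print(new_version_id("mtl"))  # e.g. 20230808T141503-mtl-9f2c1a7b
```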
Recommendations
Transfer systems:
Databases:
Infrastructure Management
Notes by Joe D’Amato
Remote Problems
All extra tech support
- Dealing with onsite tech setup
- Virtual vs 1:1
- Negatives: how do we flip to a GPU farm? Strict machine profiles; licensing
- Positives: fine-tuned configs
Render Farm
- 1 job / machine vs. slot packing (see the sketch after this list)
- USD: how to move data, and I/O patterns
- Prioritization schemas are broken
- Production: short shows have no downtime; limits of Qube
- Always use test jobs
- Mandating that jobs don’t get picked up until the test job passes
- T-shirt sizes for job types (cores / mem)
- Silo resources for security
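To illustrate slot packing vs. 1 job / machine (the T-shirt sizes and machine specs below are made up), a first-fit sketch:

```python
# T-shirt sizes for job types: (cores, GiB RAM) -- values illustrative only.
SIZES = {"S": (2, 8), "M": (8, 32), "L": (16, 64)}

def pack(jobs: list[str], machine_cores: int = 32, machine_ram: int = 128):
    """First-fit packing of jobs onto machines instead of one job per machine."""
    machines = []  # each entry: [free_cores, free_ram, [jobs]]
    for job in jobs:
        cores, ram = SIZES[job]
        for m in machines:
            if m[0] >= cores and m[1] >= ram:
                m[0] -= cores
                m[1] -= ram
                m[2].append(job)
                break
        else:
            machines.append([machine_cores - cores, machine_ram - ram, [job]])
    return [m[2] for m in machines]

print(pack(["L", "M", "S", "S", "M", "L"]))
# -> [['L', 'M', 'S', 'S'], ['M', 'L']]  (2 machines instead of 6)
```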
Liquid cooling
- Closed loop
- External cooling
- Using HVAC
- Immersion-cooled racks
Quantum compute
- Estimated 5–7 years away
- Promise of near-instantaneous computation
GPU
- 50% faster each year
- How do you buy more now, knowing that?
What Is / Is Not Pipeline?
Pipeline = transport, conformed metadata
Workflow(s) = artist-specific, able to shape the data to enter and move down the pipe
What is pipeline:
- Management of data between defined steps
Efficiency vs sustainability of workflows
- Social level (workers)
- Environmental (power, resources)
- Equipment (hardware & software, licenses)
- Are they exclusive?
Pipeline as reactive to artists/workflows vs. keeping a fixed overall structure
What is Pipeline?
- The defined process of pushing content through the steps of production
- Workflows get the content organized to get processed by the pipeline
- I think…
Pipelines are
- Flexible
- Support the goals of the production
- As invisible / transparent as possible
What is NOT a pipeline?
- How do you perform the art?
- Production tracking?
Pipeline can be the resulting workflow when you successfully get from start to finish
The order of “jobs” that need to be done to get from point A to point B
Pipeline is the glue that connects workflows and gives output files
Define the rules - how data has to be saved and used
Python at Scale
Python typing is essential for any project at scale
mypyc (bundled with mypy) can compile fully typed Python code to C extensions for higher performance
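For example, fully typed code like this passes mypy and is the kind of thing mypyc can compile (the functions are illustrative):

```python
def frame_range(start: int, end: int, step: int = 1) -> list[int]:
    """Inclusive frame range: frame_range(1001, 1003) -> [1001, 1002, 1003]."""
    return list(range(start, end + 1, step))

def chunk(frames: list[int], size: int) -> list[list[int]]:
    """Split frames into farm-task chunks of at most `size` frames."""
    return [frames[i:i + size] for i in range(0, len(frames), size)]
```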
Deploy with Kubernetes
asyncio looks promising but increases complexity (see the sketch after these bullets)
- The GIL (global interpreter lock in CPython) and PEP 703, which would make it optional
- Concern that the transition could be worse than the py2 -> py3 conversion
- Shims won’t fix everything (e.g., thread safety)
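A small asyncio sketch for context (the publish step is a stand-in): concurrency helps the I/O-bound parts of a pipeline today, while CPU-bound work still serializes on the GIL, which is why PEP 703 came up.

```python
import asyncio

async def publish(asset: str) -> str:
    await asyncio.sleep(0.1)  # stand-in for I/O: DB calls, file copies, REST
    return f"{asset} published"

async def main() -> None:
    # Run the I/O-bound publishes concurrently.
    results = await asyncio.gather(*(publish(a) for a in ("bob", "kitchen", "props")))
    print(results)

asyncio.run(main())
```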
Spack (used by supercomputing communities) and rez for system library package management
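For anyone who hasn't seen rez: a package is described by a package.py, roughly like this (the package name, versions, and layout here are invented):

```python
# package.py -- a minimal rez package definition
name = "shotpub"
version = "1.4.0"
description = "Shot publishing tools"

requires = [
    "python-3.9+",
    "usd-23",
]

def commands():
    # rez calls this (with `env` injected) when the package joins a
    # resolved environment.
    env.PYTHONPATH.append("{root}/python")
    env.PATH.append("{root}/bin")
```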
To Package (Software) or Not to Package
devpi & Artifactory
HDAs (Houdini Digital Assets), Nuke gizmos
How much on base image, how much packaged?
UI standards
Packaging for remote artists
Packaging for remote studios
Rez
How to package software
What happens
How does config management fit into package management?
Puppet
pyc files
Overlay FS
CI/CD
Packaging Nuke gizmos, HDAs
Rebuilding regularly to ensure that the recipe works
Do you need to pull to get new builds, or can the system rebuild from another recipe even if the version is not the same as the file?
Target server and hope
pdiff to ensure consistency
Asset vs package
- If an artist may want to tinker, an asset would make more sense
lipo universal binaries
PDQ Deploy
Chocolatey
Homebrew
conda recipes
Conan -> CMake
Pipeline TD: Career or Stepping Stone
Better IC (individual contributor) growth paths
Collaboration with core engineering groups
Making sure the TDs keep a close partnership with R&D so there is mutual respect
Making sure there is a clear career path for ICs to continue to grow without needing to go into management
Final Thoughts
I hope that these notes at least serve as a small record of the BoF session. Many great discussions were had and new connections were made, even if we didn’t do a great job of capturing it all.
Thanks for reading!