Publish Dependency Sanity Check

Hiya,

I’m advancing our publish pipeline and also looking more closely at file deletion.

Just wanted to sanity-check the usage of “downstream_published_files” and “upstream_published_files”.

For example:
Am I correct that if I publish an EXR sequence from Nuke, my Nuke script is an upstream Publish Dependency, but all of the other Read nodes in my script are also upstream dependencies?
Because I need those to make my render.

Just sanity checking, maybe quarantine is driving me slightly mad :stuck_out_tongue:

4 Likes

Hi @Ricardo_Musch,

Thanks for posting.

Sorry, I’m not sure of your exact use case, but the upstream and downstream fields get populated when you reference a different published file into your scene and then publish your current scene.

For example, if you reference pub_file_a into your scene (using the Loader app) and then publish your current scene (pub_file_b), then in Shotgun, pub_file_a will have a downstream link to pub_file_b, and pub_file_b will have an upstream link to pub_file_a.
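
As a rough sketch (assuming an authenticated sg script connection; the entity names are just the example above), you could inspect those fields like this:

    # Hypothetical lookup: inspect the dependency fields on pub_file_b
    pub = sg.find_one(
        "PublishedFile",
        [["code", "is", "pub_file_b"]],
        ["upstream_published_files", "downstream_published_files"],
    )
    print(pub["upstream_published_files"])  # would list a link back to pub_file_a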

Hope this helps.

Cheers,
Ben

4 Likes

Ah yeah, I was just sanity checking the dependency tree, as it were.

Follow-up question:
What is the best way to query the dependencies to find out whether a published file is used in a version, and what that version’s status is?

Ideally I would like to do some file archiving/deletion, and this would be a good way to figure that out through the published files, as not everything we render becomes a version.

For example:
I would like to identify published files that are attached to versions (or to downstream published files that are attached to versions) where the version status is set to a particular value.

How would a query like that work?

3 Likes

Hi Ricardo,

There is a Version field on the PublishedFile entity, which records whether the published file is linked to a version. You could filter on the Version field, as well as on the Version->Status field, to narrow down published files.

Is this what you are looking for? If so, since you can build the same filter in the web UI, you should be able to query via the Shotgun API’s sg.find with equivalent filter conditions (though I’m not sure of the exact scripting).
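
A rough sketch of such a query (the status code “apr” and the connection details are placeholders; your schema and status list may differ):

    import shotgun_api3

    # Hypothetical connection details
    sg = shotgun_api3.Shotgun(
        "https://yourstudio.shotgunstudio.com",
        script_name="cleanup_script",
        api_key="YOUR_KEY",
    )

    # Published files linked to a Version whose status is e.g. "apr" (approved).
    # The dotted syntax filters through the linked Version entity.
    published_files = sg.find(
        "PublishedFile",
        [
            ["version", "is_not", None],
            ["version.Version.sg_status_list", "is", "apr"],
        ],
        ["code", "path", "version", "downstream_published_files"],
    )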

Cheers,
Ben

3 Likes

This would depend on the publish plugin’s behavior. I haven’t published image sequences from Nuke, so I’m not sure whether the default hooks set upstream publishes as you describe. It would require a bit of digging.

Otherwise yes, the intention would be that you can delete upstream files if they are not needed by other publishes. It’s a nice idea to look at the versions when deciding whether something should be deleted.

Incidentally, we are looking at implementing cleanup right now too, though we haven’t started yet. The more we automate the rest of the pipeline, the more convenient this cleanup becomes.

3 Likes

I won’t only be looking at versions, since we don’t tend to create versions for everything we render (e.g. precomps, test renders, some denoise renders).

What I have done is the following:

  • Create an on-render callback for Nuke which registers a PublishedFile in Shotgun every time someone clicks the render button (sketched after this list). This way we register every render, even if it doesn’t end up in a version.
    From personal experience I know that compers like to press the “save new version” button or try things out, so there are usually a lot of renders left on disk that are never published or tracked.
    At least now they are tracked, and we can delete them if needed.

  • I changed the way the publisher registers dependencies for a Nuke script publish. The default Shotgun way simply tracks all Read nodes, which is a bit simplistic and doesn’t help with cleanup.

I register the “file” knob on all Read, ReadGeo, DeepRead and Camera nodes, and then check whether those nodes actually have any relevance to the script (see the filtering sketch after this list).
You can ask Nuke whether a node has other nodes depending on it:

    # Returns a list of nodes that depend on this node
    nuke.dependentNodes(nuke.INPUTS | nuke.HIDDEN_INPUTS | nuke.EXPRESSIONS, node)

The reason for this is that in a comp workflow you generally also load things like previous-version renders into your script to wipe between.
Shotgun’s default implementation would register those published files as dependencies even though they aren’t actually needed to rebuild your render output.

  • I changed the way published files are created, as I don’t like Shotgun’s way of registering a new one each time. If the file is in the same place (the path is the same), I see no reason not to just update the PublishedFile record instead of creating yet another duplicate record (see the update sketch below).
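
A rough sketch of the render callback (the publish type name and version handling are simplified assumptions, not our production code):

    import nuke
    import sgtk

    def register_render_publish():
        """Hypothetical render callback: track every render as a PublishedFile."""
        node = nuke.thisNode()  # the Write node being rendered
        path = node["file"].evaluate()
        engine = sgtk.platform.current_engine()
        sgtk.util.register_publish(
            engine.sgtk,
            engine.context,
            path,
            name=node.name(),
            version_number=1,  # assumption: derive this from the path in practice
            published_file_type="Rendered Image",
        )

    nuke.addBeforeRender(register_render_publish)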
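
A sketch of the dependency filtering (the exact node class names, e.g. ReadGeo2 and Camera2, are assumptions that may vary per Nuke version):

    import nuke

    # Node classes whose "file" knob we treat as a potential dependency
    DEP_CLASSES = ("Read", "ReadGeo2", "DeepRead", "Camera2")

    def gather_dependency_paths():
        """Collect file paths only from nodes the script actually uses."""
        paths = []
        for node in nuke.allNodes():
            if node.Class() not in DEP_CLASSES:
                continue
            # Skip nodes floating around with nothing attached (wipe references etc.)
            if not nuke.dependentNodes(
                nuke.INPUTS | nuke.HIDDEN_INPUTS | nuke.EXPRESSIONS, node
            ):
                continue
            file_knob = node.knob("file")
            if file_knob and file_knob.value():
                paths.append(file_knob.value())
        return paths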
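
A sketch of updating an existing PublishedFile instead of duplicating it (using path_cache as the lookup key is my assumption; it depends on how your publishes store paths):

    def create_or_update_publish(sg, path_cache, publish_data):
        """Hypothetical: reuse the PublishedFile record if one exists for this path."""
        existing = sg.find_one(
            "PublishedFile",
            [["path_cache", "is", path_cache]],
        )
        if existing:
            # Same location on disk -> update the record instead of duplicating it
            return sg.update("PublishedFile", existing["id"], publish_data)
        return sg.create("PublishedFile", publish_data)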

2 Likes

(Haha we seem to be doing a lot of similarly-themed work)

In my mind, conceptually, Versions have less weight than Publishes, i.e. you have a lot of versions of something, and when you like it, you publish the scene so others can use it downstream, or whatever. If I understand it correctly, you have it the other way around?

Your system sounds sane. The render depends on having the scene and all other inputs.
There is a question of whether a single scene publish should be used for multiple renders, or a new publish created every time you render; it’s a matter of deciding on the identity of things.

In the first case you should be careful with cleanup, since other renders might also depend on the same scene. It is more economical, however, if you worry about accumulating publishes (and maybe files on disk).
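
For example, a minimal pre-deletion safety check could look like this (assuming a connected sg handle; “safe” here just means nothing in Shotgun still points at the publish):

    def is_safe_to_delete(sg, published_file_id):
        """Hypothetical check: keep a publish if anything still references it."""
        pub = sg.find_one(
            "PublishedFile",
            [["id", "is", published_file_id]],
            ["version", "downstream_published_files"],
        )
        # Keep it if a Version was made from it, or other publishes depend on it
        return not pub["version"] and not pub["downstream_published_files"]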

2 Likes

Hmm, versions are the lifeblood of our workflow though; each version status is carefully tracked.
If a version is sent to the client, the proper fields are filled in, which means a version sent to the client could never have its dependencies deleted.

But I don’t think we do things the other way round.
Usually in a comp workflow I want my artists to check their EXR renders before publishing.
And for comp we only really publish comp renders atm, so any other renders in the script don’t get a daily rendered or a version attached (e.g. denoise renders, precomps, element precomps).
They still need to be tracked, especially for cleanup purposes, as artists tend to make a mess and not care about space.
Compers especially tend to render exorbitant amounts of data that are never used…

Not entirely sure what you mean here.
I’ve just made sure that we don’t create duplicate published files as I see no need for that; only one file can live in that location on disk, and upstream and downstream dependencies are added as we go.

1 Like

We are kinda moving in the same direction, but atm we do not publish files for each version that will be rendered. One reason for this is the lack of automatic cleanup.

A publish is meant to be like a checkpoint: a fixed state that you can load later and that won’t change.
This is usually for downstream use, e.g. by other departments. That is why you want a unique file, which will not be touched anymore.

How to handle the versioning of files warrants an entirely separate discussion. For instance, do publishes follow the version line of scenes, or have their own? There are arguments for both.

One of the tradeoffs is exactly that: accumulating a lot of data on disk vs. having guaranteed fixed points in the work process.

Shotgun’s default implementation would register those published files as dependencies even though they aren’t actually needed to rebuild your render output.

But the point is that they were needed to produce the render, not that you need them to view it.

I’ve just made sure that we don’t create duplicate published files as I see no need for that; only one file can live in that location on disk, and upstream and downstream dependencies are added as we go.

I’m probably missing some points. The default publish plugins do check for conflicting publishes (and maybe don’t let you publish unless you raise the version number; I don’t remember).
It makes sense to update an existing publish if you want to avoid accumulating a lot of data on disk.
This does, however, invalidate other artifacts (e.g. Versions) that depended on that publish, since it might not represent the same state anymore. In practice it might not matter, though.

1 Like

Nope. Using the line I posted about Nuke will return all nodes that are attached to something in the script; what it will ignore is Read nodes floating around without anything attached.
In a comp workflow these are mostly previously rendered publishes hanging around under the final output Write, because compers like to wipe between their last publish and their new publish.
That doesn’t mean those renders need to stay on disk forever; they were not needed to create that publish, they are simply there for reference. I’ve worked long enough as a compositor to know that this could easily be 20 to 30 Read nodes of old versions just sitting there in the script, not useful for anything.
Therefore they should not be registered as dependencies.

I’ve just made sure that we don’t create duplicate published files as I see no need for that; only one file can live in that location on disk, and upstream and downstream dependencies are added as we go.

The default implementation doesn’t allow publishing scene files with the same version number (that would technically overwrite the scene file, which you wouldn’t want to do).
It does, however, allow publishing duplicates of rendered files: the publish hook simply sets the previous published file’s status to “N/A”, which creates duplicates.

I’m also not advocating publishing over existing files; however, I have added callbacks to Nuke’s render system to register a published file the moment someone starts a render. Like I said before, compers render many times and sometimes version up, leaving an orphaned file sequence sitting on disk, never to be used. Since it’s now registered and never attached to a version or to upstream/downstream files, it can safely be removed at some point.
If it is already published, the record is updated instead of a duplicate being registered.

1 Like

what it will ignore is Read nodes floating around without anything attached

Right, I had missed this point earlier. So, all is good, I think we dug into deeper detail for your use case :slight_smile:

I have little experience in vfx (mostly a programmer nowadays), so I do have fuzzy spots in the workflow.

1 Like