Distributed Config and Render Farm Setup

jspeer · September 3, 2019, 8:37pm

I have set up a project with a distributed config and included the shotgun python api in that config so that remote artists shouldn’t have to download anything. Ideally they just log in to SG, everything downloads to the bundle cache folder, and then they are up and running.

I am now trying to get shotgun working on our farm (Nuke) and was wondering how bootstrapping works with a distributed config on a farm. I’m basically trying to treat my render nodes as freelance artists so that they automatically download the bundle cache locally, launch in the shotgun environment, and then execute the render/publish. Has anyone done this, because I can’t seem to figure it out? I am interested in all the caveats that go along with this approach as well.

philip.scadding · September 4, 2019, 10:24am

Hi

Welcome to the forums and thanks for posting here!

So I’m actually working on a doc/guide on this very thing right now, though unfortunately, it’s not yet in a place where I can properly share it with you. However, I do have an earlier unreleased draft of a different doc I was working on which I can share.

I just thought I’d point out that the tk-core (sgtk API) comes bundled with a copy of the Shotgun python API, and you can access it through the sgtk API. For example, you can get an authenticated instance of Shotgun via the engine:

import sgtk

# get the engine we are currently running in
currentEngine = sgtk.platform.current_engine()
# get hold of the shotgun api instance used by the engine, (or we could have created a new one)
sg = currentEngine.shotgun

Here is an extract taken from one of my much earlier drafts on bootstrapping:

Distributed configs are handled a bit differently from centralized configs as you don’t necessarily have a config stored on disk for your project yet. The approach here is to use a standalone copy of the sgtk API to bootstrap into an engine.

The bootstrap process will take care of ensuring everything is cached locally and swap out the previously imported sgtk package for the one belonging to the project you’re bootstrapping.

(Note: for this to work you need to have downloaded a standalone copy of the sgtk API.)

The bootstrap process will start an engine, and you will want to pick the engine appropriately for the environment you’re running this script in. If you’re running the script in a python interpreter outside of some software like Maya or Nuke, then the tk-shell engine serves this purpose nicely. Here is an example of how to do that:

(Note this also works for centralized configs.)

import sys
# import a standalone sgtk API instance, you don't need to insert the path if you pip installed the API
sys.path.insert(0,"/path/to/a/sgtk/api")

import sgtk

sa = sgtk.authentication.ShotgunAuthenticator()

# get pre cached user credentials
# user = sa.get_user()

# or authenticate using script credentials
user = sa.create_script_user(api_script="MYSCRIPTNAME",
                             api_key="MYSCRIPTKEY",
                             host="https://mysite.shotgunstudio.com")

sgtk.set_authenticated_user(user)

project = {"type": "Project", "id": 176}

mgr = sgtk.bootstrap.ToolkitManager(sg_user=user)
mgr.plugin_id = "basic."
# you don't need to specify a config, the bootstrap process will automatically pick one if you don't
mgr.pipeline_configuration = "dev"

engine = mgr.bootstrap_engine("tk-shell", entity=project)

# As we imported sgtk prior to bootstrapping we should import it again now as the bootstrap process swapped the standalone sgtk out for the project’s sgtk.
import sgtk

print ("engine",engine)
print ("context", engine.context)
print ("sgtk instance", engine.sgtk)
print ("Shotgun API instance", engine.shotgun)

If this is running on the farm I would recommend you use script credentials rather than user credentials, as obviously there won’t be a user present to log in.
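A minimal sketch of keeping the script credentials out of the bootstrap script itself by reading them from the render node's environment. The variable names (`SG_SCRIPT_NAME`, `SG_SCRIPT_KEY`, `SG_HOST`) and the helper `script_credentials_from_env` are hypothetical, not something Toolkit defines; the returned keys simply match the keyword arguments passed to `create_script_user` in the example above.

```python
import os

# Hypothetical environment variable names; use whatever your farm provides.
REQUIRED_VARS = {
    "api_script": "SG_SCRIPT_NAME",
    "api_key": "SG_SCRIPT_KEY",
    "host": "SG_HOST",
}


def script_credentials_from_env(environ=None):
    """Collect script-user credentials from environment variables.

    Returns a dict whose keys match the keyword arguments passed to
    create_script_user() in the bootstrap example. Raises RuntimeError
    naming every variable that is missing or empty.
    """
    environ = os.environ if environ is None else environ
    missing = [var for var in REQUIRED_VARS.values() if not environ.get(var)]
    if missing:
        raise RuntimeError(
            "Missing farm credentials: %s" % ", ".join(sorted(missing))
        )
    return {kwarg: environ[var] for kwarg, var in REQUIRED_VARS.items()}
```

With that in place, the authentication line could become `user = sa.create_script_user(**script_credentials_from_env())`, so the key never has to be written into the script that the farm job runs.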

I would recommend against bootstrapping in each render frame/task, as it will, (A) slow each render task down, and (B) potentially DDoS your site, if you have many workers running this simultaneously.

Instead, it’s better to have a pre or post job that can perform any Toolkit processing so that it is limited to once per farm job.

Let me know if you have any further questions!

Thanks
Phil

matt_ce · September 5, 2019, 3:24am

Looking forward to seeing that guide, it would be great to compare and contrast best practices vs what various studios have come up with to solve this issue.

On the script user topic, we want to replicate the user environment on the farm, and our farm is set up to run processes as the submitting user, so we’re passing along shotgun user info via a SHOTGUN_DESKTOP_CURRENT_USER environment variable, and then authenticating with it at bootstrap time, with:

import os
import sgtk

# Authenticate using the serialized user passed along with the farm job.
serialized_user = os.environ['SHOTGUN_DESKTOP_CURRENT_USER']
user = sgtk.authentication.deserialize_user(serialized_user)

# Start up a Toolkit Manager with our authenticated user
mgr = sgtk.bootstrap.ToolkitManager(sg_user=user)

I’m not sure if this has unexpected negative consequences, but it’s been working fine for us so far.

philip.scadding · September 5, 2019, 9:24am

I’m not 100% certain, but I imagine it would fail if the job was left in the queue for too long: the session token would expire, and it wouldn’t be able to authenticate without the user re-entering their username and password.
Another approach you could take would be to use script authentication and then do something similar to what’s covered here:

Essentially the get_current_login.py hook could return back the SG user name that you could provide in an env var.
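As a rough sketch of that idea, a `get_current_login.py` core hook override could look something like the following. The variable name `SG_FARM_USER_LOGIN` is hypothetical (the submitter would need to set it on the job), and the `ImportError` fallback is only there so the snippet can be imported outside Toolkit; inside a config the `Hook` base class comes from tk-core.

```python
import os

try:
    # Inside Toolkit, core hooks derive from the Hook base class.
    from tank import Hook
except ImportError:
    # Fallback so this sketch can be imported and tested outside Toolkit.
    Hook = object

# Hypothetical variable that the submitter sets on the farm job.
FARM_LOGIN_VAR = "SG_FARM_USER_LOGIN"


class GetCurrentLogin(Hook):
    def execute(self, **kwargs):
        """Return the SG login the farm job should act as.

        Returns None when the variable is unset or empty.
        """
        return os.environ.get(FARM_LOGIN_VAR) or None
```

This sketch returns `None` when the variable is missing; a fuller version would probably fall back to the operating system login so that the same hook still works on artist workstations.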

jspeer · September 9, 2019, 6:01pm

In regards to the python api coming with tk-core is there another place to call the code below from? Or a better way?

import shotgun_api3
self.sg = shotgun_api3.Shotgun("sgwebsite", login="user", password="pw")

I was using this in before_app_launch.py to eventually query the website for my software fields.

philip.scadding · September 10, 2019, 8:12am

If you’re in the before_app_launch.py hook, then you can access an already authenticated Shotgun object via

self.sg = self.parent.shotgun

self.parent will return the App, and the App has a shotgun property.

However, if you’re wanting to create a new Shotgun API instance, perhaps because you want different permissions than the currently authenticated SG instance then you could do the following:

from tank_vendor import shotgun_api3
self.sg = shotgun_api3.Shotgun("sgwebsite", script_name="test", api_key="asdfhqwery2349hciauy43")

Which will import the shotgun API from the tk-core.

jspeer · September 16, 2019, 9:54pm

I am finally getting a chance to test this stuff out, and I am struggling a bit. I have used your code from above to create a tk-nuke engine on a render node. I am calling it using the nuke command line and feeding it the python script from above with a few tweaks for the project and configuration, as well as tacking on some Nuke commands to open the script and render a frame. Below is my example nuke command.

/usr/local/Nuke11.1v2/Nuke11.1 -t /path/shotgun_bootstrap.py /path/to/nuke/script.nk

The output when running this is what I expect. I can see that the engine is starting and pulling the config down to the render node. However, it dies when trying to write the frame with this error:

ShotgunWrite2: 'WriteTank': unknown command. This is most likely from a corrupt .nk file, or from a missing or unlicensed plug-in. Can't render from that Node.

I am guessing this is because I am missing the part on how to launch a DCC properly (with environment) once the bootstrap process has been completed. Can you point me in the right direction for what I am missing?

philip.scadding · September 17, 2019, 8:16am

Awesome, sounds like you’re on the right path, and yes, I think you’re right. The code I provided above will start the engine in a project context. Your config is most likely configured not to have the Shotgun write node present in the project context, so you would need to switch to an appropriate context for the file.

After bootstrap I wonder if you could do something like:

new_context = engine.sgtk.context_from_path(current_script_path)
engine.change_context(new_context)

Or, if you can pass the context up front to the job, then even better. You could either pass a serialised context, or just a Task entity id and then call engine.sgtk.context_from_entity("Task", entity_id).
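Both options can be folded into one small helper, sketched below. The environment variable name `SGTK_TASK_ID` and the function `resolve_render_context` are made up for illustration; `tk` is expected to be the Toolkit API instance you get from `engine.sgtk` after bootstrapping.

```python
import os

# Hypothetical variable that the submitter would set on the farm job.
TASK_ID_VAR = "SGTK_TASK_ID"


def resolve_render_context(tk, script_path, environ=None):
    """Pick the context for a render job.

    Prefers a Task id passed in the environment and falls back to
    deriving the context from the script's location on disk.
    """
    environ = os.environ if environ is None else environ
    task_id = environ.get(TASK_ID_VAR)
    if task_id:
        return tk.context_from_entity("Task", int(task_id))
    return tk.context_from_path(script_path)
```

After the bootstrap you would then call something like `engine.change_context(resolve_render_context(engine.sgtk, "/path/to/nuke/script.nk"))`. Passing the Task id is the more explicit route, since it does not depend on the file path matching the templates on the render node.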

Patrick · October 4, 2019, 1:19pm

I’m curious about the potential for actually launching apps on the renderfarm using the launchapp app. This would avoid having to duplicate any logic from our launchapp hooks within a deadline prejobload method (at least that would be the aim).
Is this something that would make sense? And if so, how would it be achieved? Would it conflict with deadline’s own launch method for loading apps? (e.g. having the app as a “complex” plugin that allows client/server type communication between the dcc and deadline).
It would certainly make for an interesting alternative to bootstrapping the engine once an app has actually loaded.

matt_ce · October 10, 2019, 7:05am

Hi Patrick, I had some similar questions, Philip gave some ideas in this thread:
https://community.shotgridsoftware.com/t/launching-an-application-via-launchapp-from-the-command-line/

I do agree that it would be really helpful if there was a nice simple way of replicating launchapp’s functionality on the command line.

philip.scadding · October 10, 2019, 8:27am

Using a distributed config setup you would need your farm job to run a script that bootstrapped the tk-shell engine and ran the launch app for the software of choice.
Not complete code but something like this:

...
engine = mgr.bootstrap_engine( "tk-shell", entity={"type":"Project", "id":123} )
engine.execute_command( "houdini_fx_17.5.360", [] )

Though I don’t know how that would work with for example Deadline, and making sure the software is launched with everything it needs.

Dfulton1 · October 11, 2019, 7:32pm

Hello,

Question: is there a command I can pass to

engine.execute_command()

to launch mayabatch or mayapy, so that Maya standalone can run a process on the farm?

I have also tried engine = mgr.bootstrap_engine("tk-maya", entity=project)
but it returns an error importing maya.OpenMaya.

philip.scadding · October 14, 2019, 9:26am

The engine.execute_command() will execute commands that a Toolkit app has registered. The available Toolkit apps are configured via the environment yaml files.
The tk-multi-launchapp is responsible for launching software. It registers commands for each Software entity it finds in your Shotgun site. You could create a Software entity on your Shotgun site for mayabatch; make sure you set the path on the Software entity (and the args field if required).

The bootstrap method is used for starting the engine, but the engine is not responsible for launching the software. So if you tried to bootstrap the tk-maya engine outside of Maya then it won’t launch Maya and would fail because it can’t import the Maya API. You need to launch the software first and then bootstrap.

What Patrick was suggesting is:

1. Bootstrap the shell engine in standalone python.
2. Run the launch Maya command through the engine.
3. The tk-multi-launchapp will then launch Maya.
4. The startup script provided by the tk-multi-launchapp bootstraps the tk-maya engine.

Otherwise, you can:

1. Start Maya not via Toolkit.
2. Run your own bootstrap script to start the tk-maya engine.
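If you go the second route and share one bootstrap script between plain python and the DCCs, a small guard can avoid the import error mentioned above by only asking for a DCC engine when that DCC's python module is actually importable. This is just a sketch; `module_available` and `pick_engine_name` are illustrative names, not Toolkit functions.

```python
def module_available(name):
    """Return True if `name` can be imported in this interpreter."""
    try:
        __import__(name)
    except ImportError:
        return False
    return True


def pick_engine_name():
    """Choose an engine that matches the host interpreter.

    The `maya` and `nuke` modules are normally only importable inside
    those applications (or their bundled interpreters), so when neither
    is found we assume plain python and use tk-shell.
    """
    if module_available("maya"):
        return "tk-maya"
    if module_available("nuke"):
        return "tk-nuke"
    return "tk-shell"
```

The result would then be passed as the first argument to `mgr.bootstrap_engine(...)` instead of a hard-coded engine name.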

Dfulton1 · October 23, 2019, 9:51pm

Hello again,

Quick question: is there a way to actually pass an argument through

engine.execute_command()

or do you have to put the argument in the Software entity on the web page? It seems like the Nuke version of the command accepts them all as a string, but not the Maya command.

jspeer · October 24, 2019, 2:51pm

Philip is that guide you were working on posted on the docs now?

philip.scadding · October 25, 2019, 8:31am

The execute_command method accepts a list of args as its second parameter; however, these are the args for the Toolkit app you are launching. So if you are running houdini_fx_17.5.360, for example, then it is actually running the tk-multi-launchapp, ~~which doesn’t require any args~~ which will accept a scene file path as an arg, which it will try to open in the launched software.

If you want to pass args to the software that the launchapp will be starting, then putting the args on the software entity is the right way to go. If these need to be dynamically built, then the before_app_launch.py hook is what you need.
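A small wrapper along these lines can make the launch call a bit more forgiving on the farm. It assumes `engine` is the bootstrapped tk-shell engine and relies on its `commands` dictionary and `execute_command` method; `launch_software` itself is a hypothetical helper name.

```python
def launch_software(engine, command_name, file_to_open=None):
    """Run a launch command registered by tk-multi-launchapp.

    Raises KeyError listing the available command names when
    `command_name` is not registered with the engine.
    """
    if command_name not in engine.commands:
        raise KeyError(
            "%r is not registered; available: %s"
            % (command_name, ", ".join(sorted(engine.commands)))
        )
    # The only argument passed through is an optional scene file path.
    args = [file_to_open] if file_to_open else []
    return engine.execute_command(command_name, args)
```

For example, `launch_software(engine, "maya_2018", "/path/to/scene.ma")`. Listing the registered names in the error is handy because the command names depend on the Software entities on your site and on the environment the engine was started in.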

On the Nuke command accepting the args as a string but not the Maya command: please, can you give an example? I’m not quite sure what you’re seeing there.

As for the guide: sadly not, I’m sorry. But I promise you it’s not forgotten about. I am still working away at it; I just haven’t been able to dedicate enough time to it recently.

Dfulton1 · October 25, 2019, 2:24pm

It seems like in execute_command("maya_2018", [this is where the file to open happens]) the second argument for Maya gets passed as SGTK_FILE_TO_OPEN, whereas in execute_command("nuke_v11.2", [this is an argument]) it is passed as an argument. As far as I can tell this is happening in tk-multi-launchapp’s prepare_apps.py:

# Prep any application specific things now that we know we don't
# have an engine-specific bootstrap to use.
if engine_name == "tk-maya":
    _prepare_maya_launch()
elif engine_name == "tk-softimage":
    _prepare_softimage_launch()
elif engine_name == "tk-motionbuilder":
    app_args = _prepare_motionbuilder_launch(app_args)
elif engine_name == "tk-3dsmax":
    app_args = _prepare_3dsmax_launch(app_args)
elif engine_name == "tk-3dsmaxplus":
    app_args = _prepare_3dsmaxplus_launch(context, app_args, app_path)
elif engine_name == "tk-photoshop":
    _prepare_photoshop_launch(context)
elif engine_name == "tk-houdini":
    _prepare_houdini_launch(context)
elif engine_name == "tk-mari":
    _prepare_mari_launch(engine_name, context)
elif engine_name in ["tk-flame", "tk-flare"]:
    (app_path, app_args) = _prepare_flame_flare_launch(
        engine_name, context, app_path, app_args,
    )
else:
    # This should really be the first thing we try, but some of
    # the engines (like tk-3dsmaxplus, as an example) have bootstrapping
    # logic that doesn't properly stand alone and still requires
    # their "prepare" method here in launchapp to be run. As engines
    # are updated to handle all their own bootstrapping, they can be
    # pulled out from above and moved into the except block below in
    # the way that tk-nuke and tk-hiero have.
    try:
        (app_path, app_args) = _prepare_generic_launch(
            tk_app, engine_name, context, app_path, app_args,
        )
    except TankBootstrapNotFoundError:
        # Backwards compatibility here for earlier engine versions.
        if engine_name == "tk-nuke":
            app_args = _prepare_nuke_launch(file_to_open, app_args)
        elif engine_name == "tk-hiero":
            _prepare_hiero_launch()
        else:
            # We have neither an engine-specific nor launchapp-specific
            # bootstrap for this engine, so we have to bail out here.
            raise TankError(
                "No bootstrap routine found for %s. The engine will not be started." %
                (engine_name)
            )

From what I can tell so far, the majority of the issues seem to be coming from the distributed config, but there doesn’t seem to be too much accessible information on this. Most of the stuff I’ve found either seems a bit dated or is for the centralized distribution.

Another thing I tried was bootstrapping tk-shell and then just straight loading tk-maya with sgtk.platform.start_engine("tk-maya", tk, ctx). The problem I run into is that it will not let me run tk = sgtk.sgtk_from_path("/my/project/root"); I keep getting this error:

tank.errors.TankInitError: You are loading the Toolkit platform from the pipeline configuration located in ‘/home/dfulton/private/.shotgun/flightschool/p94.basic./cfg’, with Shotgun id None. You are trying to initialize Toolkit from Project 94, however that is not associated with the pipeline configuration. Instead, it’s associated with the following configurations: ‘None’ (Pipeline config id 17, Project id None), ‘None’ (Pipeline config id 18, Project id None), ‘None’ (Pipeline config id 20, Project id None), ‘None’ (Pipeline config id 21, Project id None), ‘None’ (Pipeline config id 22, Project id None).

Also, I can launch Maya or Nuke with execute_command("software", ['']), but it doesn’t seem to load Shotgun with it, which probably means that it can’t load sgtk. I’m not really sure where to go from here.

philip.scadding · October 25, 2019, 2:59pm

Yes, that is true actually: you can pass a file as an argument to the app (I’ll update my original message), thanks for pointing that out. However, you can’t pass arbitrary software args using this method.

I’m not sure what issue you are running into? Are you saying that when you run the launch app command it starts the software but not the Shotgun integration?

Dfulton1 · October 25, 2019, 3:10pm

Yes, when I start the software it does not load the shotgun integration. Also, I’m just trying to run a process such as baking an animation rig through the maya command line and have it publish the results to shotgun. Currently I’m trying to do this locally; eventually I’d like to have it do this on the farm.

philip.scadding · October 25, 2019, 3:31pm

OK, so just to confirm the steps.

You have a script that is run in python, that bootstraps the tk-shell engine and then executes the launch maya command (via the tk-multi-launchapp). Maya starts but the Shotgun menu doesn’t appear.

Are you able to launch it OK via the Shotgun Desktop app (I’m guessing so)?
Do you get any errors in the tk-maya.log file?
Do you get any errors in the Maya script editor?
I would print out all the environment variables in your launched Maya session, from both Desktop (if that works) and from your custom launch process, and cross compare them, to make sure they are the same.

My bet here is that either something is erroring in the Maya bootstrap process, or the Maya userSetup.py file is not being run at all.
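For the environment comparison suggested above, something like the following could be used. It is plain python with no Toolkit calls; `dump_environment` and `diff_environments` are illustrative names. Run the dump from the script editor in each Maya session, then load the two JSON files and diff them in any python interpreter.

```python
import json
import os


def dump_environment(path, environ=None):
    """Write the environment to a JSON file; run this in each session."""
    environ = os.environ if environ is None else environ
    with open(path, "w") as handle:
        json.dump(dict(environ), handle, indent=2, sort_keys=True)


def diff_environments(working, broken):
    """Compare two environment dicts.

    Returns (missing, extra, changed): variables only in `working`,
    variables only in `broken`, and variables whose values differ,
    mapped to a (working_value, broken_value) tuple.
    """
    missing = sorted(set(working) - set(broken))
    extra = sorted(set(broken) - set(working))
    changed = {
        key: (working[key], broken[key])
        for key in set(working) & set(broken)
        if working[key] != broken[key]
    }
    return missing, extra, changed
```

Anything that shows up as missing or changed in the custom launch is a good place to start looking, in particular path-style variables that decide which startup scripts Maya picks up.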