Alfred 6.5 Release Notes

Enhancements

Codename Project BatCave
Alfred MySQL Logging and Web-based Database Browsing

The project codenamed "BatCave" is an effort to export Alfred status and task execution history to a generic database in a form which allows people to develop monitoring and analysis tools which are specific to their needs and interests.
Alfred components such as the dispatcher, maitre-d, and alfserver have been modified to (optionally) log various data and events directly to a MySQL database.
The database provides a highly programmable environment for creating custom status queries and for doing historical analysis. The database server can also offload the live status reporting/formatting function performed by the maitre-d in "watch-servers" mode, which can improve the maitre-d performance at large sites.
A php-based web interface for viewing the records in this SQL database is also shipped as part of the project. The web interface provides a broad range of basic functions and is fully customizable, so new site-specific queries and reports can easily be added.
Several new alfred.ini settings control the basic database logging capability. Bootstrap scripts are also provided for creating the database tables and the web interface.
This capability has broad implications for integrating Alfred with existing production databases, as well as with site control and planning projects. Alfred features in this area will continue to grow and evolve from the initial groundwork being released here.
For more details, see the Project BatCave documentation.

Your job scripts can update MySQL records too -- In addition to the built-in Alfred data logging, Alfred scripts can also add or modify records in the database. One approach is to simply have one of the job tasks invoke a command-line application that makes the desired updates; this could be a custom application that makes use of the MySQL API, or it could be a direct or indirect invocation of the 'mysql' command-line client which comes with MySQL. There are also new Alfred scripting options, Job -sqlset and Task -sqlset, which can update the BatCave tables directly. The typical usage would involve creating one or more site-specific columns in the existing Job or Task tables; these custom columns would then be updated with values from the job script when the corresponding Job or Task record is created. One generic column called "jobgroup" is already provided in the default Job table; it is intended for arbitrary site use in this way. For example:
Job -title "my job" \
	-sqlset {jobgroup='shot-c57a'} \
	-subtasks ...
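
The command-line approach mentioned above could be sketched roughly as follows. This is only an illustration, not something shipped with Alfred: the helper names, the host, user, and database values, and the input validation are all assumptions. The "Job" table and its "jid" and "jobgroup" columns are taken from the description above, and the mysql client flags (-h, -u, -D, -e) are the same ones used elsewhere in these notes.

```python
import re
import subprocess


def build_jobgroup_update(host, user, db, jid, jobgroup):
    """Return the argv for a mysql client call that sets Job.jobgroup."""
    # Hypothetical validation: accept only simple values so they can be
    # embedded in the SQL text without quoting problems.
    if not re.fullmatch(r"[A-Za-z0-9_.-]+", jobgroup):
        raise ValueError("unexpected characters in jobgroup: %r" % jobgroup)
    sql = "UPDATE Job SET jobgroup='%s' WHERE jid=%d" % (jobgroup, int(jid))
    return ["mysql", "-h", host, "-u", user, "-D", db, "-e", sql]


def update_jobgroup(host, user, db, jid, jobgroup):
    """Run the update; needs the mysql client and access to the database."""
    argv = build_jobgroup_update(host, user, db, jid, jobgroup)
    subprocess.run(argv, check=True)
```

A job task could invoke a small script built around update_jobgroup, passing the job's database id on its command line.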

New Cmd Substitutions

The Cmd/RemoteCmd "launch expression" which defines the command-line to be executed may contain several special "%" substitution macros that are expanded by Alfred before the command is launched. There are substitutions for the names of bound servers, etc. A few new ones have been added which are especially useful when dealing with the batcave databases:

%J expands to the batcave MySQL Job "jid" for the current job.
%j expands to the internal dispatcher job-id for the current job. Note that these are not globally unique.
%t expands to the Task "tid" for the current task. While not globally unique, it is unique within the job, and is used both internally by Alfred and in the batcave tables (as "tid").
%c expands to the batcave MySQL Cmd "cmdid" for the current Cmd/RemoteCmd.
So an example usage might be the following pointless, and somewhat recursive, command:
Cmd {mysql -h host -u user -D db \
      -e {select commandline from Cmd where jid=%J \
          and tid=%t and cmdid=%c}}
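
To make the effect of these substitutions concrete, here is a minimal Python sketch of the kind of expansion described above. It is not Alfred's implementation: the function name and the id values are hypothetical, and only the four new macros are handled.

```python
import re


def expand_launch_expression(expr, values):
    """Expand %J, %j, %t and %c in a launch expression.

    `values` maps each macro letter to its replacement. Any other "%"
    sequence is left untouched in this sketch.
    """
    def replace(match):
        return str(values[match.group(1)])

    return re.sub(r"%([Jjtc])", replace, expr)


# Hypothetical ids, chosen only for illustration.
ids = {"J": 1207, "j": 3, "t": 15, "c": 88012}
query = "select commandline from Cmd where jid=%J and tid=%t and cmdid=%c"
expanded = expand_launch_expression(query, ids)
print(expanded)
# prints: select commandline from Cmd where jid=1207 and tid=15 and cmdid=88012
```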

New and Different settings in alfred.ini

Alfred administrators are encouraged to browse through the new alfred.ini and perhaps "diff" it with the one currently in use. There have been several changes and additions which may be relevant. For example, there is now a way to limit the number of output log records retained on disk for each task in a job; this can help reduce log file sizes when tasks have "run-away" diagnostics (like prints in shaders).

Enable/Disable the "Watch Servers" and "Master Schedule" menu items

There are new configuration settings in alfred.ini for controlling whether the Watch Servers and Master Schedule menu items are enabled or disabled in the Alfred user interface. Large sites may want to consider disabling Watch Servers in particular, since it can add considerable load to the maitre-d process when there are a lot of servers to monitor. Also, many of the Project BatCave features (above) are intended to provide similar or improved functionality.

Unicast Metrics

Alfserver and the Alfred maitre-d "discover" each other on the network using multicast packets addressed to a particular multicast "session" address.
Having found each other, alfservers then deliver periodic status updates called metrics to the maitre-d (and now also to a MySQL database, see the BatCave discussion above). These metrics are used as a basic measure of server health and the values can be used to make specific server assignment decisions. Starting with Alfred 6.5, metrics are reported to the maitre-d using point-to-point "unicast" UDP packets. In previous releases the metrics were also multicast back to the discovery address, for use by potentially many interested listeners. The unicast approach can reduce some network overhead at large sites, especially in situations where the one-to-many nature of multicast traffic causes problems for smart network switches that try to optimize one-to-one communications.
Routers on the network ensure that the multicast discovery messages are delivered to all "subscribed" systems. By default, Alfred and alfserver use the multicast "session" address 239.255.224.99, port 9002/udp. Sites can change this multicast address by adding the hostname "alf-status" to the site nameserver (e.g. DNS, NIS, /etc/hosts), and picking a new multicast address for it from the multicast range (224.0.0.0 - 239.255.255.255). Note that there are IANA numbering conventions which apply to multicast addresses.
Alternatively, conventional "unicast" communications can be used for both discovery and metrics delivery. This is done by simply adding "alf-status" as a hostname alias for the maitre-d host's regular IP address, rather than using a multicast address. This approach is actually a way to bypass the "discovery" phase. The alfserver metrics will be sent as standard UDP packets directly to the named maitre-d. Note that this approach should not be used with fallback maitre-ds, since alfservers would only know about the one named host, and metrics would only be delivered to that one host.
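
When choosing an address for the "alf-status" alias, it can help to check which of the two modes described above a given address would select. The following Python sketch does that using the standard ipaddress module. The function names are hypothetical, and the fallback to the default session address when the alias is undefined is an assumption based on the default behavior described above.

```python
import ipaddress
import socket

# Default multicast session address quoted in these notes.
DEFAULT_SESSION_ADDRESS = "239.255.224.99"


def discovery_mode(address):
    """Classify an IP address string as 'multicast' or 'unicast'."""
    ip = ipaddress.ip_address(address)
    # For IPv4, is_multicast covers 224.0.0.0 - 239.255.255.255.
    return "multicast" if ip.is_multicast else "unicast"


def alf_status_mode(hostname="alf-status"):
    """Resolve the alias through the local resolver and classify it."""
    try:
        address = socket.gethostbyname(hostname)
    except OSError:
        # Alias not defined: assume the default session address applies.
        address = DEFAULT_SESSION_ADDRESS
    return discovery_mode(address)
```

For example, discovery_mode("239.255.224.99") returns "multicast", while an ordinary host address such as "10.0.0.5" returns "unicast".
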
A new alfserver configuration setting, "metricsDelivery", can now be set to "multicast" to force metrics to be sent to the multicast address (as in releases prior to 6.5), so that any other interested listeners can receive them simultaneously.
There is also a new way to deliver configuration overrides to all alfservers from the maitre-d: create a file called $RATTREE/etc/alfsite.ini containing the overrides, in an ini location accessible to the maitre-d. Its contents will be sent by the maitre-d to the alfservers as part of the discovery process, along with the site metrics definitions.

New Task Menu Item: Try this task next

There is a new item that appears on the Task menu when you click on a particular task's box in the job diagram window. This new entry allows you to request that the given task should be dispatched next, if possible. This simply changes the local dispatcher's "next task" logic and does not affect the actual job priorities relative to jobs from other dispatchers. This action is only available on tasks that are "Ready" to execute.

New Alfred Assigner Code

The "inner loop" of the maitre-d server assignment algorithm has been changed. The new code is both more uniform (there's only one assignment entry point), and more accurate in the face of a wide-ranging mix of incoming request types and frequencies.
The assigner code is also provided in source-code form, as has been true in prior releases, so sites can create an "assigner plug-in" that implements an alternative set of policies.
Note: there is currently no backward compatibility support for assigner plug-ins written for prior releases. Sites that have such plug-ins will need to port the relevant changes to the new algorithm. The existence of old plug-ins will not cause errors, since the maitre-d will just fall back to using the default built-in scheme. It distinguishes new plug-ins by searching for the new, versioned, name of the assigner object factory method.

Bug Fixes

Improved the handling of assignment requests, see above.
Improved the load-balancing among jobs on the local dispatching queue when in "job parallel" mode.
Connections to remote Alfred dispatchers, using "alfred -h user@host", now use the site maitre-d, if available, to determine the remote connection port. The prior use of 'rsh' for this purpose, while nominally somewhat more secure, added unnecessary complexity at most sites and is increasingly unlikely to work as rsh support dwindles. A new ini setting (rshForDispatcherDiscovery) can be used to restore the old behavior.
Certain "alfserver not responding" situations are now handled more correctly, and those servers are more consistently taken out of the assignment pool for a period specified by "timerAvoidNoListener".
A bug was fixed in the handling of Alfred "maitredHost" lists in configuration files other than $RATTREE/etc/alfred.ini, such as found via $RAT_SCRIPT_PATHS. If the primary maitre-d went offline, dispatchers using the alternate configuration file locations would sometimes end up in "chaos mode" (using a private, local, maitre-d), and then be unable to reconnect to the main maitre-d when it came back online.
Fetching task output logs with lines longer than 1024 characters sometimes failed due to faulty encoding for transmission. This has been fixed.
Support for handling log files greater than 2GB in size has been enabled on Linux systems. This should fix problems loading existing job checkpoints for large jobs, and address crashes or other misbehavior when logs grew beyond 2GB.
A new alfred.ini setting, "maxTaskOutput", limits the number of records logged on a per-task basis. Some problems with large task output logs can be avoided if no individual task is allowed to log more than 5000 records, for example.
Upon receipt of SIGHUP on unix-style systems, the Alfred dispatcher and maitre-d now explicitly close and reopen their diagnostic log files (as specified by the "-log filename" command-line option). This should allow them to interoperate better with log rotation facilities such as logrotate on Linux systems.
Fixed some cases where paths containing blank spaces would cause problems for launching certain applications.
Fixed several problems involving Alfserver's handling of RMANCONFIG.
Better temporary file names for Alfred jobs and logs are now chosen, to prevent occasional collisions on some systems.
An issue with retrying preflight tasks in a job that caused Alfred to crash has been fixed.
Several problems related to "skipping" subtasks of a "shared server" parent task have been fixed.
The timerMaitredQueue setting from alfred.ini is now obeyed properly.
Alfred's Help menu now uses the "HelpURLs" set of preferences.
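
The SIGHUP behavior mentioned in the list above (closing and reopening the diagnostic log so that rotation tools can move the old file aside) follows a common unix pattern. The Python sketch below illustrates that general pattern only. It is not Alfred's code; the class name and the choice to defer the reopen until the next write are assumptions.

```python
import os
import signal


class ReopenableLog:
    """Append-only log file that is closed and reopened after SIGHUP."""

    def __init__(self, filename):
        self.filename = filename
        self.stream = open(filename, "a")
        self.reopen_requested = False
        # Must be called from the main thread on a unix-style system.
        signal.signal(signal.SIGHUP, self._on_sighup)

    def _on_sighup(self, signum, frame):
        # Only set a flag here; the file is reopened at the next write.
        self.reopen_requested = True

    def write(self, message):
        if self.reopen_requested:
            self.stream.close()
            self.stream = open(self.filename, "a")
            self.reopen_requested = False
        self.stream.write(message + "\n")
        self.stream.flush()
```

A rotation tool would rename the current log file and then send SIGHUP to the process; the next write then creates a fresh file under the original name.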

Limitations

There are several minimum version requirements for the servers used for the "BatCave" functions (e.g. php, mysql, alfserver). See the Project BatCave documentation for details.
Alfserver support for scriptable, key-based, per-command, environment configuration was recently extended to include support for "netrender -R key ..." This includes the ability to select which user will own the resulting prman process. Currently, the alfserver ownership mode "login" is not a viable option for netrender connections. This is because netrender and prman exchange some of their data on sockets which are connected to stdin and stdout; these connections do not survive the login set-up. The alternative "setuid" mode works as expected, and is frequently a better choice anyway, from an administrative point of view. Note that "login" continues to be supported for RemoteCmd usage, although again, "setuid" is usually the more manageable choice.

Additional Changes for 6.5.1

Bug Fixes

An intermittent problem spooling new rendering jobs to Alfred from within Maya on Mac OS X has been fixed.
A problem which prevented Alfred from being able to open "../resources/alfred.brt" on Mac OS X when installed remotely on a case-sensitive file system has been addressed.
An issue with unicast metrics delivery interruption from Windows alfservers that occurred when the maitre-d shut down has been fixed. Note that the full fix for this problem requires an updated alfserver.exe (12.5.2 and beyond).
A problem causing alfserver to sometimes repeatedly request alfsite.ini from the maitre-d has been fixed.
An issue with the envkey settings in alfserver.ini on Windows that prevented establishing a correct RATTREE has been fixed.
Job queries by hostname on the BatCave's main page now execute properly.
A potentially serious bug has been fixed regarding the way in which very long metrics definitions are buffered during transmission.

Additional Changes for 6.5.2

Enhancements

Threaded metrics processing — The Alfred maitre-d can now be configured to handle inbound metrics reports from Alfservers using a separate thread within the maitre-d process. See the alfred.ini setting "metricsReceiveThreaded" for the configuration information. This option should be considered at sites where the maitre-d's slot assignment throughput is diminished by large numbers of inbound metrics reports. (NOTE: This is for Linux and OS X only.)
New low-latency pings have been implemented for the Alfred maitre-d. This allows sites that are using metrics to rely on them as preflights for costly pings. See the Low-Latency Pings discussion for details.
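
The idea behind threaded metrics processing, receiving inbound UDP reports on a separate thread so that the main loop is not held up by network reads, can be illustrated with a short Python sketch. This shows the general pattern only and makes no claim about how the maitre-d implements it. The function name, the loopback address, and the use of a queue to hand reports to the main thread are all assumptions.

```python
import queue
import socket
import threading


def start_metrics_receiver(host="127.0.0.1", port=0):
    """Receive UDP datagrams on a background thread.

    Returns (socket, queue). Each received (payload, sender) tuple is put
    on the queue, so the main thread can process reports when convenient.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))  # port 0 lets the system pick a free port
    inbound = queue.Queue()

    def receive_loop():
        while True:
            try:
                payload, sender = sock.recvfrom(65535)
            except OSError:
                break  # the socket is no longer usable; stop the thread
            inbound.put((payload, sender))

    threading.Thread(target=receive_loop, daemon=True).start()
    return sock, inbound
```

The main thread would then call inbound.get() or inbound.get_nowait() between its other duties instead of reading from the socket directly.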

Bug Fixes

A bug in the "maitre-d initializing" wait period has been fixed. An incorrect conditional test allowed some slot assignments to occur before the entire initial-wait period had expired.
A bug that caused the Alfred maitre-d to crash has been fixed. The crash occurred when a dispatcher submitted "reconstruct current state" messages before the disconnect messages from the prior instance of that dispatcher had been fully processed.