Posts tagged xml

Scripts for Monkeying with Cinelerra XML

I’ve moved my Cinelerra projects from one directory path to another and scrambled a few things in the process. I’ve written a few bash scripts which helped me sort things out.

In the process, I’ve learned a bit about the xml format which Cinelerra uses to store references to files and edits. When you add a resource, it appears twice between opening and closing “ASSET” tags with the format <ASSET SRC=”/path/to/filename”>[asset details]</ASSET>. Then every time you include that file in a track of your project, the filename appears again in an “EDIT” tag complex in the format <FILE SRC=”/path/to/filename”>[edit details]</FILE>

First, I needed to check the paths for all the resources in the file to make sure they are still valid. I called this program test_xml_paths.sh. I corrected the incorrect paths with a text editor, but found that Cinelerra was still crashing with a memory error when I loaded the project.

Second, I decided to find and eliminate all of the unused assets in the file. I originally had a forty minute video in one file, but some time ago I split the project into four ten minute chunks. When I created the original project, I had added a couple of thousands of clips for possible inclusion, so now that I’m working on the final cut, I don’t need all that bloat.

I created a function to figure out which assets were being used in the edit list and which were not: list_unused_assets.sh.

Finally, I wrote a script which would recreate a new xml project file with all the redundant assets removed: remove_unused_assets.sh.

I can’t offer any warranty on these scripts, of course, but I hope they may help somebody else. If nothing else, there’s a great example of multi-line sed editing in the last one.

Remove Unused Assets from a Cinelerra XML Project File

This is a bash script to remove all the unwanted assets from a Cinelerra project XML file. I needed it to cut out bloat from a project I was working on which had hundreds of unused resources after I split the project up into ten minute chunks. Doing so stopped Cinelerra from crashing when I loaded the project: YMMV.

Save this script as “remove_unused_assets.sh”. It needs the function in “list_unused_assets.sh” which I posted here.

#! /bin/bash
#kk remove_unused_assets.sh
#kk
#kk For cinelerra xml files
#kk this script removes the contents of asset tags which are not also used for edits

if [ -z "$1" -o "$1" = "-h" ]
then
echo “Usage: remove_unused_assets [xml_filename]“
echo “For cinelerra xml files, this script removes the contents of asset tags which are not being used for edits”
exit
fi

TMPDIR=”/tmp”
SCRIPTDIR=”$HOME/Video/video_new/common/scripts”

. $SCRIPTDIR/list_unused_assets.sh
. $SCRIPTDIR/include.inc

#assign the parameter to a variable
XMLFILE=$1

#location of the temporary files
ASSETS_FILE=”$TMPDIR/ASSETS_FILE”
EDITS_FILE=”$TMPDIR/EDITS_FILE”
XMLFILE_NEW=”$1″_new
XMLFILE_TMP=”$TMPDIR/XMLFILE_TMP”

#check if the xml file exists and exit if it does not
if [ ! -e $XMLFILE ]; then error “XML file does not exist”; fi

list_unused_assets $XMLFILE
if [ $? != 0 ] ; then error “list_unused_assets failed”; fi

cp $XMLFILE $XMLFILE_TMP

echo
echo Removing Unused Assets…
echo
echo “Working…”
echo

# loop through the list of unused assets and remove the ASSET tag lines for each
# sed command from here: http://ilfilosofo.com/blog/2008/04/26/sed-multi-line-search-and-replace/

for f in $( comm -23 $ASSETS_FILE $EDITS_FILE ); do

#to make the filename string work in sed, I have to escape all of the periods and slashes
FORMATTED_f=`echo “$f” | sed -e ‘s:\/:\\\/:g’ -e ‘s:\.:\\\.:g’`

#this complicated sed script came from the sed FAQ: http://sed.sourceforge.net/sedfaq4.html#s4.21
#in order to make the variable substitution work, I had to close the sed script with a single quote
#then enclose the variable in double quotes, then re-open the sed script with a single quote
#the -i option is for editing a file “in place”
sed -i -e ‘
# sed script to delete a block if /regex/ matches inside it
:Top
/\/ { # For each line between these block markers..
/\/ASSET\>/!{ # If we are not at the /end/ marker
$!{ # nor the last line of the file,
N; # add the Next line to the pattern space
b Top
} # and branch (loop back) to the :Top label.
} # This line matches the /end/ marker.
/’”$FORMATTED_f”‘/d; # If /regex/ matches, delete the block.
} # Otherwise, the block will be printed.
#—end of script—
‘ $XMLFILE_TMP

#diff $XMLFILE $XMLFILE_TMP

echo -n ‘ \r’
echo -n $f’\r’

done

cp $XMLFILE_TMP $XMLFILE_NEW

echo
list_unused_assets $XMLFILE_NEW

echo
echo Finished.
echo
echo The new version of the xml file has been saved as $XMLFILE_NEW.
echo

This program depends on include.inc which contains the following:

#!/bin/bash

#use colour in the output just for fun
GREEN=$(printf “\033[32m”)
RED=$(printf “\033[31m”)
NC=$(printf “\033[0m”)

error()
{
echo ERROR: $1
exit 1
}

I tried to make these scripts as tidy and portable as possible — you will, however, need to change the path for $SCRIPTDIR to the location you are using (and $TMPDIR if you don’t want to put your temporary files in /tmp).

List Unused Assets in a Cinelerra XML Project File

This is a bash script which contains a function used to figure out which assets in a Cinelerra project are not required for the edit list. I created it because I had pulled in a couple of thousand media resources for a project and wanted to eliminate bloat. I moved the project and found it kept crashing on me until I eliminated the crud. I wrote another program to call this function and then actually clean out the xml. You can view “remove_unused_assets.sh” here.

Running this script doesn’t do anything until you call the enclosed function. This function can be called from the command line with:

. list_unused_assets.sh;list_unused_assets
#! /bin/bash
#kk list_unused_assets.sh
#kk
#kk check a cinelerra xml file to see which assets have been loaded, but not used in the project
#kk provide filename to check as parameter
#kk from the command line, type “. list_unused_assets.sh;list_unused_assets

list_unused_assets()
{

SCRIPTDIR=”$HOME/Video/video_new/common/scripts”
TMPDIR=”/tmp”

. $SCRIPTDIR/include.inc

#location of the temporary files
ASSETS_FILE=”$TMPDIR/ASSETS_FILE”
EDITS_FILE=”$TMPDIR/EDITS_FILE”

#assign the parameter to a variable
XMLFILE=$1

#check if the xml file exists and exit if it does not
if [ ! -e $XMLFILE ]; then error “XML file does not exist”; fi

#if the temp file exists, delete it
if [ -e $ASSETS_FILE ]; then
rm $ASSETS_FILE
fi

if [ -e $EDITS_FILE ]; then
rm $EDITS_FILE
fi

echo
echo Checking $XMLFILE for assets unused by the project…
echo

#find all of the filenames in the xml file and strip the extra junk
grep -E “ASSET SRC=” $XMLFILE | sed -e ‘s/^.*SRC=”//g’ -e ‘s/”>.*$//g’ | sort | uniq > $ASSETS_FILE
grep -E “FILE SRC=” $XMLFILE | sed -e ‘s/^.*SRC=”//g’ -e ‘s/”>.*$//g’ | sort | uniq > $EDITS_FILE

#output the counts
echo `grep -E “ASSET SRC=|FILE SRC=” $XMLFILE | sed -e ‘s/^.*SRC=”//g’ -e ‘s/”>.*$//g’ | sort | uniq | wc -l` Total Assets
echo $RED`comm -23 $ASSETS_FILE $EDITS_FILE | wc -l` Unused Assets$NC
echo $GREEN`comm -12 $ASSETS_FILE $EDITS_FILE | wc -l` Used Assets$NC

}

This program depends on include.inc which contains the following:

#!/bin/bash

#use colour in the output just for fun
GREEN=$(printf “\033[32m”)
RED=$(printf “\033[31m”)
NC=$(printf “\033[0m”)

error()
{
echo ERROR: $1
exit 1
}

I tried to make these scripts as tidy and portable as possible — you will, however, need to change the path for $SCRIPTDIR to the location you are using (and $TMPDIR if you don’t want to put your temporary files in /tmp).

Test File Paths in a Cinelerra XML File

This is a bash script to test all the filenames and paths in a Cinelerra project to make sure they are still valid.

#!/bin/bash
#kk test_xml_paths.sh
#kk
#kk check a cinelerra xml file to make sure the file references are all valid
#kk provide filename to check as parameter

if [ -z "$1" -o "$1" = "-h" ]
then
echo “Usage: test_xml_paths [xml_filename]“
echo “For cinelerra xml files, this script checks that file paths are valid”
exit
fi

SCRIPTDIR=”$HOME/Video/video_new/common/scripts”
TMPDIR=”/tmp”

. $SCRIPTDIR/include.inc

#assign the parameter to a variable
XMLFILE=$1

#check if the xml file exists and exit if it does not
if [ ! -e $XMLFILE ]; then error “XML file does not exist”; fi

#location of the temporary file
FILENAMES=”$TMPDIR/TEST_XML_PATHS”

#set counters to zero
GOODCOUNT=0
BADCOUNT=0

#if the temp file exists, delete it
if [ -e $FILENAMES ]; then
rm $FILENAMES
fi

echo
echo Checking $XMLFILE for valid file paths…
echo

#find all of the filenames in the xml file and strip the extra junk
#send the output to a file
grep -E “SRC=|PATH=” $XMLFILE | sed -e ‘s/^.*SRC=”//g’ -e ‘s/^.*PATH=”//g’ -e ‘s/”>.*$//g’ > $FILENAMES
if [ $? != 0 ] ; then error “grep failed”; fi

# loop through the output and test if the filename is either a link or a file
for f in $( cat $FILENAMES ); do
if [ ! -h $f -a ! -e $f ]; then
BADCOUNT=`expr $BADCOUNT + 1`
echo ERROR: Missing $f
else
GOODCOUNT=`expr $GOODCOUNT + 1`
fi
done

#output the counts
echo
echo $RED$BADCOUNT Total Bad Pathss $NC
echo $GREEN$GOODCOUNT Total Good Paths$NC
echo

This program depends on include.inc which contains the following:

#!/bin/bash

#use colour in the output just for fun
GREEN=$(printf “\033[32m”)
RED=$(printf “\033[31m”)
NC=$(printf “\033[0m”)

error()
{
echo ERROR: $1
exit 1
}

I tried to make these scripts as tidy and portable as possible — you will, however, need to change the path for $SCRIPTDIR to the location you are using (and $TMPDIR if you don’t want to put your temporary files in /tmp).