We often need to run a command that converts a PDF into text, converts .doc files into HTML, and so on. The trouble is that these commands accept only one file at a time, which is very tedious when the same task has to be performed on several files, especially from a script.
I propose a solution to this problem using ls, sed, grep, awk and sh. The idea is to build the correct command line for each file, one per row, and execute them with sh. Since sh executes one line at a time, RAM consumption does not grow, whereas other methods can even freeze underpowered machines.
Let's see how to carry out this command sequence.
1- The first thing we have to do is list the files that will be used, through ls:
ls --directory /camino/a/carpeta/*.ext
2- Then we need each file name wrapped in quotes, "/path/to/group of files":
ls --directory /camino/a/carpeta/*.ext | sed 's/^/"/' | sed 's/$/"/'
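To check what steps 1 and 2 produce before going further, the two stages can be tried on a scratch folder. This is only a sketch: the folder and the file names are made up for the demonstration, and ls -d is used as the short form of ls --directory.

```shell
# Scratch folder with two sample files (hypothetical names, one with a blank).
demo_dir=$(mktemp -d)
touch "$demo_dir/report one.ext" "$demo_dir/notes.ext"

# Step 1 lists the files, step 2 wraps every line in double quotes.
quoted_list=$(ls -d "$demo_dir"/*.ext | sed 's/^/"/' | sed 's/$/"/')

# Each line is now a quoted path that sh can read as a single argument.
printf '%s\n' "$quoted_list"
```

The quoting matters for names such as "report one.ext", which would otherwise reach the command as two separate words.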
3- Now awk is ready to receive the data.
ls --directory /camino/a/carpeta/*.ext | sed 's/^/"/' | sed 's/$/"/' | awk '{print $0}'
Because awk has its own language, the quotes that we want to appear in the output must be escaped and, among other things, we need the backslash \ for that. Let's see how to escape some characters.
Escape a quote
\"
Show a backslash in the output (we will need to type three backslashes)
\\\
Sometimes we need an isolating separator; only the text or the quotes that appear between the two backslashes will be sent to the output:
'""'\"\'""'
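A small sketch of how the shell and awk read two of these sequences, using made-up sample text. It only covers the quote escape and the isolating separator, and it is meant as an illustration rather than a full description of awk's escaping rules.

```shell
# \" inside an awk string prints a literal double quote.
quoted=$(printf 'file name.ext\n' | awk '{print "\"" $0 "\""}')
printf '%s\n' "$quoted"      # "file name.ext"

# The shell reads '""'\"\'""' as: close the single quote, an empty "" string,
# the single-quoted text \"\ , another empty string, reopen the single quote.
isolated=$(printf '%s\n' 'A'""'\"\'""'B')
printf '%s\n' "$isolated"    # A\"\B
```

In other words, once the shell has removed its own quotes, what awk receives at that spot is the three characters \"\ .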
4- Let's see how to rename all the listed files with the mv command, just to add a prefix. (From now on, every time we need to refer to the file we use "$0".)
ls --directory /camino/a/carpeta/*.ext | sed 's/^/"/' | sed 's/$/"/' | awk '{print "mv "$0" \"`dirname "$0"`/Text-any-`basename "$0"`\""}' | sh
Note that the combination « | sh » is added at the end, as shown in the previous sequence; it pipes the generated lines to that command interpreter.
Let's see some examples ready to be placed in a script.
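Before handing generated lines to sh it can help to look at them first. The sketch below builds the mv lines for a scratch folder, prints them, and only then executes them; the folder, the file names and the Text-any- prefix are placeholders for the demonstration.

```shell
# Scratch folder with sample files (hypothetical names).
demo_dir=$(mktemp -d)
touch "$demo_dir/a file.ext" "$demo_dir/b.ext"

# Build the command lines, but do not execute them yet.
generated=$(ls -d "$demo_dir"/*.ext | sed 's/^/"/' | sed 's/$/"/' |
  awk '{print "mv "$0" \"`dirname "$0"`/Text-any-`basename "$0"`\""}')

# Inspect exactly what sh would receive.
printf '%s\n' "$generated"

# Once the lines look right, pass them to sh, which runs them one by one.
printf '%s\n' "$generated" | sh
ls "$demo_dir"
```

Keep in mind that sh interprets the generated text, so file names containing double quotes, dollar signs or backquotes would not survive this approach unchanged.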
Examples:
1- Convert all the pdfs that are listed into text files.
ls --directory "$@" | sed 's/^/"/' | sed 's/$/"/' | awk '{print "pdftotext",$0}' | sh
2- Let's say we want to apply an effect to an image without modifying the original. Here is an example with the wave effect, well known from the Windows XP logo, which is a waving flag (to appreciate this effect better, a .png extension is recommended for the resulting image).
ls --directory "$@" | sed 's/^/"/' | sed 's/$/"/' | awk '{print FS="convert -wave 25x150 "$0"","\"\`dirname "$0"`/`basename "$0" | sed '"'"s/\\\\.[[:alnum:]]*$//"'"'`-wave.`basename "$0" | rev | awk -F . \'"'"'\{print $1}\'"'"'\ | rev`'""'\"\'""' "}' | sh
Note: several passes are made in this sequence:
- One to get the folder where the file is located with dirname
- Another to obtain the base name, but removing the extension of said file
- Another to obtain the extension of said file.
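The three pieces can also be looked at in isolation. The sketch below is not the sequence used above: it swaps the sed and rev passes for the shell's own parameter expansion, and the path is a made-up example.

```shell
path="/tmp/demo dir/flag.photo.jpg"   # hypothetical file name

dir=$(dirname "$path")     # folder where the file is located
base=$(basename "$path")   # file name without the folder
stem=${base%.*}            # base name with the last extension removed
ext=${base##*.}            # text after the last dot, i.e. the extension

# Reassemble the pieces into the name of the modified copy.
target="$dir/$stem-wave.$ext"
printf '%s\n' "$target"    # /tmp/demo dir/flag.photo-wave.jpg
```

Only the last dot is treated as the start of the extension, which matches the behaviour of the rev | awk -F . pass in the sequence.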
3- Let's now see how to rename a group of files by putting the corresponding number in front of the name (numeric prefix).
ls --directory "$@" | sed 's/^/"/' | sed 's/$/"/' | awk '{print FS="mv "$0" '""'\"\'""'`dirname "$0"`/"FNR"-`basename "$0"`'""'\"\'""' "}' | sh
Let's see how to add a numeric suffix (a number at the end, but before the extension). This option is only valid if the file has an extension.
ls --directory "$@" | sed 's/^/"/' | sed 's/$/"/' | awk '{print FS="mv "$0" \"`dirname "$0"`/`basename "$0" | sed '\'s/\\\\.[[:alnum:]]*$//\''`-"FNR".`echo "$0" | rev | awk -F . '""'\'\'""'{print $1}'""'\'\'""' | rev `\" " }' | sh
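Both numbering variants rely on FNR, the awk variable that counts the lines read from the current input. A reduced sketch, with a scratch folder and made-up file names, shows the numbering on its own without renaming anything.

```shell
# Scratch folder with sample files (hypothetical names).
demo_dir=$(mktemp -d)
touch "$demo_dir/alpha.txt" "$demo_dir/beta.txt"

# -F/ splits each path on slashes, so $NF is the file name;
# FNR is the number of the line currently being read.
prefixed=$(ls -d "$demo_dir"/*.txt | awk -F/ '{print FNR "-" $NF}')
printf '%s\n' "$prefixed"
```

The numbers follow the order in which ls lists the files, so the first listed file gets 1, the second 2, and so on.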
4- Let's see an example where we have to enter data or select from a group of options, taking the case where we remove password protection from several PDF files that share the same password. (In this case we use zenity for the dialog box.)
zenity --entry --hide-text --text "introduzca la clave de desbloqueo" > $HOME/.cat && ls --directory "$@" | sed 's/^/"/' | sed 's/$/"/' | awk '{print FS="pdftk "$0" input_pw `cat $HOME/.cat` output \"`dirname "$0"`/`basename "$0" .pdf`-unlock.pdf\" "}' | sh && rm $HOME/.cat
As you can see, the idea is to cat a file that is created only once, at the beginning of the line, and deleted once the conversion is complete.
5- Another use is when we need to unzip several .zip files.
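The same "write once, read for every file, delete at the end" pattern can be sketched in a form that runs without a desktop session. Everything here is an assumption for illustration: ask_password is a hypothetical stand-in for the zenity dialog, the printf line stands in for the real pdftk call, mktemp replaces $HOME/.cat, and a plain for loop replaces the awk-generated lines.

```shell
# Hypothetical stand-in for the zenity dialog, so the sketch runs anywhere.
ask_password() { printf '%s\n' "example-secret"; }

process_all() {
  secret_file=$(mktemp)            # temporary file instead of $HOME/.cat
  ask_password > "$secret_file"    # written once, before any file is processed
  for f in "$@"; do
    # Placeholder for the real per-file command (for example pdftk with input_pw).
    printf 'would unlock %s with %s\n' "$f" "$(cat "$secret_file")"
  done
  rm -f "$secret_file"             # removed once every file has been handled
}

process_all "one.pdf" "two file.pdf"
```

Since mktemp picks a fresh name on every run, two runs of the script at the same time would not overwrite each other's password file.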
ls --directory "$@" | sed 's/^/"/' | sed 's/$/"/' | awk '{print "unzip -x "$0" "}' | sh
Example
"unzip -x "$0" "
6- Let's see an example that protects a PDF with a password, allowing it to be read but protecting it against printing, copying and other operations (the options ticked in the dialog box are the ones that will be allowed in the PDF; to allow none of them, select none).
zenity --separator " " --multiple --text "Seleccione los Opciones que quiere permitir" --column "Opciones" --list "Printing" "DegradedPrinting" "ModifyContents" "CopyContents" "ScreenReaders" "ModifyAnnotations" "AllFeatures" > $HOME/.cat && zenity --entry --hide-text --text "Teclee la contraseña de protección" > $HOME/.cat2 && ls --directory "$@" | sed 's/^/"/' | sed 's/$/"/' | awk '{print FS="echo \"pdftk \\\"`echo "$0"`\\\" output \\\"`dirname "$0"`/`basename "$0" .pdf`-locked.pdf\\\" allow `cat $HOME/.cat` owner_pw \"`cat $HOME/.cat2`\"\" | sh "}' | sh && rm $HOME/.cat $HOME/.cat2
These examples show how to use this approach to convert, modify or rename several files with a single script instead of processing them by hand one by one. Memory consumption with this approach is minimal, depending on the command being used, since the files are not converted all at once but one after the other.
ls --directory %F | sed 's/^/"/' | sed 's/$/"/' | awk '{print "script-convertir-video "$0" "}' | sh && zenity --info --text "Todas las conversiones han terminado"
END
Wouldn't it be a lot, but MUCH easier to do all of this using regular expressions or wildcards? I don't understand what the difference is between that and making your life so complicated with this.
Honestly, tahed, you have great knowledge of Linux commands. Very useful!
Yes, I know we will learn a lot with him around here hahaha.
I think this is much easier:
ls -d /path/to/folder/*.ext | while read file; do COMMAND "$file"; done
Instead of COMMAND you can put whatever you want, and it works even if the file names contain blank spaces as long as you put $file between quotes. You don't need to use sed for that or generate the commands with awk. This also launches fewer processes.
or:
for i in $(ls -d /path/to/folder/*.ext); do COMMAND "$i"; done;
That looks good, but if the file names contain blanks it doesn't work. 🙂
In fact, hexborg, that is why the output text is quoted at the beginning and at the end of each line with this option:
ls --directory | sed 's/^/"/' | sed 's/$/"/'
I clarify that find can be used to search the subdirectories.
But with my trick you don't have to. ls prints the full names of the files, one per line, and read reads line by line and leaves the file name in the file variable whether it has blank spaces or not. You just need to put quotes around $file when using it in the command.
I agree that with find it can be less cumbersome. Let's take this example from the article:
ls --directory "$@" | sed 's/^/"/' | sed 's/$/"/' | awk '{print "pdftotext",$0}' | sh
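The read loop can be tried on a scratch folder. This is a sketch with made-up file names; basename stands in for COMMAND, and IFS= together with -r is an addition of mine so that leading blanks and backslashes in a name are kept as they are.

```shell
# Scratch folder with sample files (hypothetical names, one with a blank).
demo_dir=$(mktemp -d)
touch "$demo_dir/with space.ext" "$demo_dir/plain.ext"

# Each line printed by ls lands in $file as a whole, blanks included.
seen=$(ls -d "$demo_dir"/*.ext | while IFS= read -r file; do
  basename "$file"    # stand-in for COMMAND "$file"
done)
printf '%s\n' "$seen"
```

The one case this still cannot handle is a file name that itself contains a line break, since the loop works line by line.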
The same could well be achieved like this, and it probably runs faster:
find . -type f -print0 | xargs -0 -n 1 pdftotext
That said, the article is welcome, it's always good to learn about alternative ways of doing something.
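A scratch-folder sketch of the find and xargs combination, with made-up file names. Two details are my own assumptions on top of the comment: -name '*.pdf' limits the search to PDF files, and -n 1 runs the command once per file, because pdftotext treats a second file argument as the name of the output file. basename stands in for pdftotext so the sketch runs without it installed.

```shell
# Scratch folder with sample files (hypothetical names).
demo_dir=$(mktemp -d)
touch "$demo_dir/a b.pdf" "$demo_dir/c.pdf" "$demo_dir/notes.txt"

# -print0 and -0 separate the names with NUL bytes, so blanks are harmless.
# -n 1 makes xargs start the command once for every file.
found=$(find "$demo_dir" -type f -name '*.pdf' -print0 | xargs -0 -n 1 basename | sort)
printf '%s\n' "$found"
```

find does not promise any particular order, which is why the output is passed through sort here.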
If you notice, the $i is in quotes. That makes escaping whitespace unnecessary.
Yes, but the $() operator expands the file names without putting quotes anywhere, so the variable i already receives the file names cut into pieces. Try it in a terminal in a directory that has files with spaces in the names.
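The effect is easy to reproduce in a scratch folder. The sketch below counts how many times each loop runs for a single file whose made-up name contains a blank; the second loop, which iterates over the wildcard directly, is an alternative not mentioned above and is included only for comparison.

```shell
# Scratch folder with one file whose name contains a blank (hypothetical name).
demo_dir=$(mktemp -d)
touch "$demo_dir/two words.ext"

# $(ls ...) is split on blanks before the loop starts: one file, two pieces.
pieces=0
for i in $(ls -d "$demo_dir"/*.ext); do
  pieces=$((pieces + 1))
done

# Looping over the wildcard itself keeps every name in one piece.
whole=0
for i in "$demo_dir"/*.ext; do
  whole=$((whole + 1))
done

printf 'pieces=%s whole=%s\n' "$pieces" "$whole"   # pieces=2 whole=1
```

By the time "$i" is quoted inside the first loop, the name has already been split, so the quotes come too late.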
Very good, complex, but very interesting.
this is amazing, great !!!!
Excellent, the plasticity of GNU/Linux has no limits.