hello

i had a nasty challenge today: i had to assemble a web interface to shell scripts.
what better to use than cgi-bin? yeah yeah, there are pros and cons to that, but in this case
it was the most feasible option to go with.

the first challenge was the reason why i need to use the shell scripts at all. the server acts as a gateway client to an external system; in other words, there is a set of shell client tools/commands to control external hardware systems. now, for proprietary reasons, the client tools (binaries) are not installed under /bin, /usr/bin, /sbin etc. they get installed in their own path, /opt/app/bin/command. in addition, the binaries are hardened so that they do not execute or yield results unless run exclusively as root (let's not debate this, this is how it is).

so, the first challenge was to bypass the root account limitation. how? by using cgi-bin and including the apache daemon account "httpd" in /etc/sudoers.
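
a sudoers entry for this might look roughly like the sketch below. the rule format is standard sudoers syntax, but the command path, the NOPASSWD choice and the helper name sudoers_rule are assumptions made up for illustration, not the actual config from this setup. the helper only prints the line so it can be reviewed first.

```shell
#!/bin/sh
# sketch only: print a sudoers rule that lets one account run one command as root.
# "sudoers_rule" is a hypothetical helper; the path below is an assumed example.
sudoers_rule() {
    # $1 = account name, $2 = absolute path of the single allowed command
    printf '%s ALL=(root) NOPASSWD: %s\n' "$1" "$2"
}

# httpd may run only /opt/myapp/mycommand as root, without a password prompt
sudoers_rule httpd /opt/myapp/mycommand
```

the printed line would then be added with visudo, which checks the syntax before saving.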

the next trick was to use the software's configuration file to allow the httpd user to execute commands thru sudo.

the third and most challenging part concerned apache itself. the apache daemon's PATH environment variable includes only the basic set: /usr/bin:/sbin:/usr/sbin. so how was that a challenge? well, i did not want to write static paths to the commands in the scripts, e.g. /opt/myapp/mycommand -arg1 -arg2, but to just use mycommand -arg1 -arg2. why so? simply so i don't have to rewrite a hell of a lot of scripts to include static paths (which in the long run, with software updates, would cause headaches). i didn't want to create symbolic links in /usr/sbin or another existing path for the same reason, and also to avoid broken links and FollowSymLinks issues with apache.

so what to do? after struggling with this the whole day, the end result was to hack /usr/sbin/apachectl to include PATH=$PATH:/opt/myapp. now this alone did not resolve everything. for some reason, the default apache init script doesn't pick up this hack when executing stop/start, but the graceful option works. this is easy to fix in the init script, it's just annoying having to hack an init file.
so now i got my /opt/myapp path included in the apache default PATH variable.
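
a slightly more defensive variant of that PATH line, as a sketch: it appends the directory only when it is not already there, so sourcing the file twice does not keep growing PATH. the helper name append_path is made up for this example, and it assumes the file it lands in (apachectl or whatever env file it sources) is read by a POSIX shell.

```shell
#!/bin/sh
# sketch: append a directory to PATH only if it is not already present.
# "append_path" is a hypothetical helper name; /opt/myapp is the example path used above.
append_path() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already in PATH, nothing to do
        *) PATH="$PATH:$1" ;;     # otherwise add it at the end
    esac
}

append_path /opt/myapp
append_path /opt/myapp            # second call changes nothing
export PATH
```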

the last trick that i did was ..
instead of ending my scripts with .sh or .pl or whatever, i used .txt. did you guess why? for a very simple reason: different browsers behave in different ways, and most of them identify from the file suffix (.pl/.sh/.php/.txt) what kind of file is in question and parse the returned stream accordingly. so when i used .txt as the suffix, i avoided having to use html tags and the browser was able to properly interpret \r\n.
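
an alternative that does not depend on the file suffix, assuming the script is executed as a CGI program, is to state the content type in the response header. the sketch below is not what was done here; cgi_plain is a made-up wrapper name and the echo call only stands in for a real command.

```shell
#!/bin/sh
# sketch: run a command as a CGI program and label its output as plain text.
# "cgi_plain" is a hypothetical wrapper name, not part of apache or the existing scripts.
cgi_plain() {
    # CGI response header, followed by the blank line that ends the header section
    printf 'Content-Type: text/plain\r\n\r\n'
    # run the given command; stderr goes to the browser too, so errors are visible
    "$@" 2>&1
}

# in a real cgi-bin script the last line would be something like:
#   cgi_plain mycommand -arg1 -arg2
cgi_plain echo "hello from cgi"
```

with text/plain announced explicitly, the browser shows line breaks as they are, whatever the script file is called.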

so all in all, i got my thing to work, and now i can use the existing shell scripts with the proprietary software thru cgi-bin without having to hack the scripts and with little effort on the server side.

i just hope i will remember these hacks the next time i have to do this again...
Nice. Very similar to what I've done MANY times in the past :)

Hacking an init script is like hacking apachectl so you're ok. But there might be another "cleaner/best practice" way to do this: the env config file for apachectl. What OS is this running on? On Ubuntu/Debian distros it's /etc/apache2/envvars, on RedHat it's /etc/sysconfig/httpd or something... Scan your apachectl file to see where and what file it sources for its environment variables.
i just hope i will remember these hacks the next time i have to do this again...
Documentation is the key. ;)

Nice job.
Awesome story!
A few remarks, in my humble opinion :)

First trick: root permissions
ARRRRRRG! This one made me react. It feels too much like an abuse of sudo. The reason is I feel I cannot hold in mind all the things that Apache does on my server. I can think of a quick fix. The way I would've done it is simply create a separate sudo user appuser, and launch commands as this user with something like rsh (is there a better way to change users? I don't know.)
$ # 0 stands for localhost
$ rsh 0 -l appuser sudo /opt/myapp/mycommand -arg1 -arg2
This could've also avoided some problems that you had later on, like the Apache $PATH one.

Second trick: Configuring sudo
I'm not sure what you mean. I never configured sudo beyond adding users to the sudoers file. It'd be great if you could tell us more about it.

Third trick: Apache env variable
This one is particularly tricky. I love the idea of using httpd to launch the user. Yet setting up its env variables seems complicated. I never used it, but just like Ed says, you should grab your OS documentation and learn how httpd does that. If you want a quick hack, you could always do this
$ # On Debian 6 with bash.
$ type env 
env is hashed (/usr/bin/env)
$ # You could do something like:
$ env PATH=$PATH:/opt/myapp sudo /opt/myapp/mycmd argv
env(1) is available on most (if not all) modern unices. On Linux it's part of the GNU userland (package name 'coreutils').

In case you're wondering, the currently running shell doesn't get affected by this change, which makes it a pretty solid solution, despite its hackish appearance. If you want more info, beware: on Linux, the man page of env is a bit empty, but it has a good 'info' page (it is a GNU utility after all).

Here's the command to get help with env:
$ info coreutils 'env invocation'
I really recommend you read it, it will save you a lot of headaches for next time.
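
A small sketch to check the claim that env leaves the current shell alone. The `sh -c` child only stands in for a real command, and /opt/myapp is the example path from this thread.

```shell
#!/bin/sh
# Sketch: env changes PATH for the child process only.
before=$PATH

# The child sees the extended PATH (sh -c stands in for the real command here).
child_path=$(env PATH="$PATH:/opt/myapp" sh -c 'printf "%s" "$PATH"')

printf 'child:  %s\n' "$child_path"
printf 'parent: %s\n' "$PATH"
```

One caveat when the child is sudo: whether that PATH reaches the command depends on the sudoers settings (env_reset and secure_path), so it is worth testing on the actual box.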

Fourth trick: No HTML
This one is my favorite. It's what I call a very useful hack. It has just one downside: you cannot assume cross-browser compatibility, especially for future upgrades. Since it's a non-standard feature, it might (I said _might_) come back and bite you in the derriere. Yet this is not a big issue, and if it means that you avoided having to write annoying HTML, then hell yeah!


Finally, it seems from the description that you maintain a legacy system dragging a lot of old burdens (you have to be root, use cgi-bin, ...). It sounds cool. Can you give us a technical description of your setup?


Leaving you with the obligatory xkcd reference.
Ed wrote:
Nice. Very similar to what I've done MANY times in the past :)

Hacking an init script is like hacking apachectl so you're ok. But there might be another "cleaner/best practice" way to do this, the env config file for apachectl. What OS is this running on? Ubuntu/Debian distros /etc/apache2/envvars , RedHat, /etc/sysconfig/httpd or something... Scan your apachectl file to see where and what file it sources for its environment variables.
i just hope i will remember these hacks the next time i have to do this again...
Documentation is the key. ;)

Nice job.
actually i did initially try to do it with the init script. for some obscure reason it did not work out, maybe i wasn't patient enough, so i moved on. the apache documentation did not cover this topic very well and google kept misleading me, referring to env variables concerning virtual hosts and client sessions. that is why it took me so long to resolve :P

anyhow, i did try it out by including PATH=$PATH:/my/path in /etc/sysconfig/httpd. it worked. i just didn't find documentation stating that i can create my own vars just like that :P. actually, security wise it's a bit scary! what misguided me was the number of places where you can place this. one of those systems where parallel and sequential reading of configuration files makes you go nuts. this is one of the reasons to go vanilla instead of using distro oriented interpretations. this is what makes linux a nightmare with so many distros: no two are alike, and each time you want to do something you really need to know the minute details, glitches and history.

anyhow, it works now, and even though this took me time, it took me much less than it would have taken to go thru all the existing automation scripts and have them redone.
there were several methods to circumvent the root issue:
- have apache recompiled, disabling the security safeguard that prohibits it from running as root, and run apache as root. this would have been a total nightmare as it would have granted all integrated extension modules a backdoor thru apache. a horrific scenario.
- actually sudo can be repurposed as i did: if you have a daemon or a user that is to execute a single command as root, you can restrict what that command is with sudo. it is a nightmare to maintain if too much is required, but for example, presuming that mkdir can be executed only as root, you can grant the "httpd" user account root privilege to issue a mkdir.
- i could have attempted to reconfigure the application i am integrating into apache via cgi-bin, but that would have been a hassle for the software developer to maintain and for the administrators to use. you wouldn't believe how often i have to strace etc to resolve issues, fix paths, link missing libraries etc. often i have to do the dirty work, because if i was to wait for the developer to fix and for the admin to implement, i can just as well postpone everything to the far future.
- the application in question is security-wise hardened and will not function unless it runs as root or individual accounts are granted limited access. there is a multitude of reasons why this is so. the most basic reason is that the application has a designated function that needs to be audited and controlled; on the other hand, the application integrates with the os hardware modules at a low level, so it is simply easier for the developers to keep the actions executed under root. i wish it was otherwise. but this is something that concerns other deeply integrated applications as well. applications are not perfect, nor is unix/linux. actually, on this point the granular configuration possibilities in unix systems can in certain cases beat linux (some security options such as in solaris).
- i could have used php or perl to issue the calls, but that would have meant that i would have had to code, something that would have taken a bit of time, time that isn't budgeted and not worth it since this is a one-time resolution.
- etc, things i can't remember to mention now

there are a multitude of ways things can be done, the objective was to get this done with the least amount of work hours and simple for others to maintain.

in regards to the apache PATH variable, this is not to be confused.
your shell PATH variable is one thing, the PATH variable under apache is totally different!
it could just as well have been named $APACHEPATH to distinguish between the two.
the way it works is that when the daemon is started, it loads a minimum set of variables passed to it.
these can be set in many places, which you notice when comparing how different distros configure
and share variables. anyhow, when apache runs, it inherits that minimum set of paths to work with
and passes it on to the cgi-bin. i don't know of a way to state that the cgi-bin should use a different
$PATH variable than the apache daemon does. by hacking more in depth, this would definitely be addressable.
but once again, a simple and elegant solution is required, something that is easy to do and easy to maintain.
since this is not about the OS-wide $PATH but about apache's own variable with the same name, it can not be controlled from an OS-wide config file.
of course this differs by distro. debian is very different from RHEL and suse and archlinux and gentoo.
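
a quick way to see which PATH a cgi-bin script really inherits is to drop in a tiny diagnostic script like the sketch below and open it in the browser. it assumes a plain /bin/sh CGI setup; print_path_entries is a made-up helper name.

```shell
#!/bin/sh
# sketch: show the PATH this process inherited, one directory per line.
# "print_path_entries" is a hypothetical helper; it runs in a subshell, so the
# IFS and globbing changes do not leak into the rest of the script.
print_path_entries() (
    IFS=:
    set -f                      # no filename globbing while splitting
    for dir in $1; do
        printf '%s\n' "$dir"
    done
)

# CGI header plus blank line, then the directories apache handed down
printf 'Content-Type: text/plain\r\n\r\n'
print_path_entries "$PATH"
```

if /opt/myapp does not show up in that list, the change made on the apache side has not reached the cgi-bin environment.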

as for your request to describe the env, well, here is a short description:

software xyz is installed on a certified rhel linux. xyz is used to perform administrative and automation tasks
on an external system. the external system is connected thru an inband connection and a sideband connection.

loads of scripts have been created over time to ease administrative and automation tasks. these scripts were
initially developed on hpux and were written to work with a particular version of kornshell, not the same kornshell
available on linux. the scripts had to be partially rewritten for them to function under linux kornshell or bash.
thank goodness for my colleagues' effort in porting these to work with linux.

sometimes we need to grant third party users view access so they can see the results of the automation tasks.
this is not an issue when working with a shell, but then again that would mean that they would have more access
than they actually require or need. so in this case, as one example of this integration gimmick that i did, this web interface grants a "readonly" interface
to the realtime output, which is triggered on demand rather than scheduled to run as a cron task.

and yes, my day to day work concerns legacy, proprietary and dirty setups. if you only knew all the crap
that i get to see :P
Amazing description. The system you work on seems very similar to the one I work on at my job. It's an old, 1993 old, legacy financial system running on Solaris 8 with a very old (1988) korn shell (it has ... arrays! Yaay!).

It's still weird to find some comments on some scripts like:
#!/bin/ksh
# author: Sam Body
# date: 01/05/1994
And if you think that it's old so it means it must be solid and well written, oh my god how wrong this is. It's a completely disgusting shell script, that makes you want to cry and go home to your mommy.

I agree that porting scripts across unices is not just about shell compatibility. It drives me insane how each unix has their own syntax for find, grep or sed/awk ...

Writing portable shell is an art. Writing portable powerful shells makes you one with the machine.
A couple of external resources:

- I wrote an article on my blog putting together tips I have learned in the past couple of years or so.
- StackOverflow have a Unix version of their website. I've been participating more or less actively there for the past year (just got the yearling badge!). A couple of members there are actually Unix Gods, and it's great to get feedback and advice from them