Argument Parsing

In some tools, giving no arguments means reading from standard input. Perl copies this from tools such as sed(1) and awk(1), which makes it easy to operate on standard input or on the given file(s).

    $ cat args
    #!/usr/bin/env perl
    print while readline;
    $ echo foo | perl args
    foo
    $ echo foo > foo
    $ echo bar > bar
    $ perl args foo bar foo
    foo
    bar
    foo

Some tools complicate matters by allowing a "-" to indicate that stdin should be consumed.

    $ echo foo | perl args -
    foo
    $ echo foo | perl args bar - bar
    bar
    foo
    bar
    $ echo foo | cat bar - bar
    bar
    foo
    bar

Standard input is typically consumed only once; see tail(1), and in particular the -f flag, for more details. A tool might throw an error if stdin is requested more than once, or it might not, as with cat(1) here, where only the first "-" yields any input.

    $ echo foo | cat - bar - bar - bar -
    foo
    bar
    bar
    bar
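
If an error is preferred, the check can happen before any input is read. The following is a minimal sketch, not something the args script above does; the names count_stdin_requests and check_stdin_once are made up for illustration, and the policy of dying on a second "-" is only one possible choice.

```perl
use strict;
use warnings;

# Hypothetical helper: count how many arguments ask for standard input.
sub count_stdin_requests {
    my @args = @_;
    return scalar grep { $_ eq '-' } @args;
}

# Hypothetical policy: refuse to run if "-" is given more than once.
sub check_stdin_once {
    my @args = @_;
    my $n = count_stdin_requests(@args);
    die "stdin requested $n times, but it can only be read once\n" if $n > 1;
    return 1;
}
```

Calling check_stdin_once(@ARGV) ahead of the read loop would make a command line such as "- bar -" fail up front instead of silently getting nothing from the second "-".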

Some tools only allow "-" as the first (and possibly only) non-flag argument to indicate a read from standard input. Most unix tools are probably sloppier about this.

    $ cat first
    #!/usr/bin/env perl
    if ( @ARGV == 1 and $ARGV[0] eq '-' ) {
        print while readline *STDIN;
    } else {
        die "TODO\n";
    }
    $ echo foo | perl first
    TODO
    $ echo foo | perl first -
    foo

You should probably not name a file "-", but if you do for some reason, it may need to be given as "./-" instead of just "-". In theory unix filenames can contain any byte except "/" and NUL. In practice there are various limitations (or disasters waiting to happen) caused by flag conventions and by POSIX shell automatic whitespace splitting (and globbing!!).

    $ echo minus > -
    $ perl args -
    ^C
    $ perl args ./-
    minus
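
The "./" trick can be applied mechanically to filenames that come from elsewhere, such as user input or a glob. This is a small sketch; protect_filename is a hypothetical name, and it assumes that any relative name with a leading "-" is meant as a file and not as a flag.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite a relative filename that starts with "-"
# so that tools will not mistake it for a flag or for standard input.
# Names that do not start with "-" (absolute paths included) are
# returned unchanged.
sub protect_filename {
    my ($name) = @_;
    return $name =~ /^-/ ? "./$name" : $name;
}
```

With this, "-" becomes "./-" and "-rf" becomes "./-rf", while "foo" and "/tmp/-" pass through as they are.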

Some tools may want a timeout on a read from standard input if the input does not show up quickly enough. The default is to block until interrupted by an external event, such as a control+c or a system reboot. A timeout is more complicated and risks false positives where the other tool is simply too slow to supply its output. One could check whether standard input is attached to a TTY, but that TTY could be a fake one created by expect(1), in which case the input is again just slow to show up and no human is wondering why nothing is happening. The trade-off is between giving the user a better interface ("hey! you need to supply some data!") and potentially breaking a data pipeline with irrelevant messages or errors just because the input source is taking too long to produce output.
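
One way to sketch the timeout and the TTY check in Perl is with the core IO::Select module and the -t file test. The helper names input_ready and should_prompt, and the 5 second default, are assumptions made for this example. The fake-TTY caveat still applies: -t cannot tell a human from expect(1).

```perl
use strict;
use warnings;
use IO::Select;

# Hypothetical helper: true if $fh has data (or end-of-file) ready to
# read within $seconds, false if the wait timed out.
sub input_ready {
    my ($fh, $seconds) = @_;
    my @ready = IO::Select->new($fh)->can_read($seconds);
    return scalar @ready;
}

# Hypothetical policy: only nag when a terminal is attached, so that a
# slow pipeline is left alone. The 5 second default is arbitrary.
sub should_prompt {
    my ($fh, $seconds) = @_;
    $seconds //= 5;
    return 0 unless -t $fh;    # not a TTY: stay quiet and keep blocking
    return input_ready($fh, $seconds) ? 0 : 1;
}
```

A tool might call should_prompt(\*STDIN) once before its read loop and print a hint to standard error when it returns true, then carry on with the normal blocking read so that slow input is never treated as an error.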

Sloppy Flags

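Sloppy flags are flags that appear after the first non-flag argument. Some standards forbid them: the POSIX utility syntax guidelines, for example, want all options to precede the operands, so the first non-flag argument disables options processing: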

    $ ls -l /etc >/dev/null
    $ ls /etc -l >/dev/null
    ls: -l: No such file or directory

However, it may make sense to allow this if one must frequently fiddle with the flags. Sticking a flag onto the end of the command line is probably easier (assuming your shell puts the cursor at the end of the command line, which is not the case for all shells) than navigating the cursor to somewhere between the command name and where the non-option arguments begin.

On the other hand, there may be a security concern if an attacker can supply arguments and should not be able to adjust the flags. Likewise, if user input accidentally is a flag instead of a filename, it is probably better to error out than to run with an incorrect flag. "rm foo -rf" comes to mind, which should probably be additionally guarded against with "--" to disable options processing: "rm YOUR_FLAGS_GO_HERE -- USER_SUPPLIED_FILES_GO_HERE". So if you do use sloppy flags, avoid them on tools used in a security context, or where the tool has the potential for data loss or corruption if used incorrectly. Lojban dictionary tool? Probably okay to allow sloppy flags. Database admin tool? Nope, that probably wants strict flag processing.
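
In Perl the choice between strict and sloppy flags can be made with the core Getopt::Long module, whose "require_order" setting stops at the first non-flag argument and whose "permute" setting allows flags anywhere before a "--". The sketch below is only an illustration; parse_flags and the --verbose/-v flag are made up for the example.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: parse a made-up --verbose/-v flag out of @args.
# With $strict true, option processing stops at the first non-flag
# argument; otherwise flags may appear anywhere before a "--".
# Returns the flag value and a reference to the leftover arguments.
sub parse_flags {
    my ($strict, @args) = @_;
    # Configure changes module-wide state, so it is set on every call.
    Getopt::Long::Configure($strict ? 'require_order' : 'permute');
    my $verbose = 0;
    GetOptionsFromArray(\@args, 'verbose|v' => \$verbose)
        or die "bad flags\n";
    return ($verbose, \@args);
}
```

Given the arguments "/etc -v", the strict mode leaves both as leftover arguments and does not set the flag, while the sloppy mode sets the flag and leaves only "/etc". In either mode, anything after "--" is left alone, so "-v -- -rf" yields "-rf" as an ordinary argument.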


Source