Homework 3: Strings

Due: 2019-10-18 at 23:59 (originally 2019-10-13)

Preliminaries

First, find a partner. You’re allowed to work by yourself, but I highly recommend working with a partner. Click on the assignment link. One partner should create a new team. The second partner should click the link and choose the appropriate team. (Please don’t choose the wrong team. There’s a maximum of two people per team, and if you join the wrong one, you’ll prevent the correct person from joining.)

Once you have accepted the assignment and created/joined a team, you can clone the repository on clyde and begin working. But before you do, read the entire assignment and be sure to check out the expected coding style.

Be sure to ask any questions on Piazza.

Coding style

This section is identical to the Coding style section of Homework 2.

You should run clang-format on each of the C source and header files you write. This will format all of your files consistently.

If you use NeoVim or Vim as your editor, you can include the line (called a mode line)

// vim: set sw=2 sts=2 ts=8 et:

at the bottom of each of your files to force Vim to indent by 2 spaces and to ensure that tabs will insert spaces. You can set options in your ~/.vimrc file, creating one if necessary. For example, on clyde, I have the simple ~/.vimrc below. (Note that this is slightly expanded from the one given in the Homework 1 write-up to set the values for C files.)

set background=dark
filetype plugin indent on
autocmd FileType sh setlocal shiftwidth=2 softtabstop=2 tabstop=8 expandtab
autocmd FileType c setlocal shiftwidth=2 softtabstop=2 tabstop=8 expandtab

The first line tells Vim to use colors suitable for a terminal with a dark background. The second line tells Vim to use file-type aware indenting. The third line tells Vim to set those options for shell script files. And the fourth line does the same thing, but for C files. See the Vim wiki for more details.

If you use emacs, you’re kind of on your own. Feel free to ask on Piazza, search StackOverflow, and read the Emacs Wiki.

Same with Nano. This might be useful.

Code warnings and errors

Your code must compile without error or warning when compiled with -std=c11 -Wall.

Additionally, for this assignment, you need to compile and link your code with -fsanitize=address,undefined. To do this, we need to make some changes to your Makefile rules. Take a look in the skeleton Makefile provided with the assignment code.

Notice how there is a new variable LDFLAGS. This is where flags passed to the linker go. Since -fsanitize=address,undefined needs to be passed to both the compiler and the linker, we include it in both CFLAGS and LDFLAGS. The rule to compile binaries now looks like the following.

foo: a.o b.o c.o
        $(CC) $(LDFLAGS) -o $@ $^

Submission

To submit your homework, you must commit and push to GitHub before the deadline.

Your repository should contain the following files

├── Makefile
├── README.md
├── tests
│   ├── common.sh
│   ├── run_tests
│   ├── test_caesar
│   ├── test_help
│   ├── test_pig
│   ├── test_reverse
│   └── test_shouty
└── wordmod.c

It may also contain a .gitignore file, which tells Git to ignore files in your working directory that match the given patterns (like *.o, for example).

You may also split your implementation of wordmod into multiple source and header files. These should also be included in your repository.

Any additional files you have added to your repository should be removed from the master branch. (You’re free to make other branches, if you desire, but make sure master contains the version of the code you want graded.)

The README.md should contain

Each of your source files with a main function should contain a comment at the top of the file that contains usage information plus a description of what the program does.

Example.

// Usage: ./wordmod [OPTIONS]
// 
// Wordmod will ...

Part 0. Makefile, testing, and Travis CI

Makefile

The assignment code comes with a bare-bones Makefile. You need to expand upon it as you work through the following parts. Use a pattern rule to compile .o files from the corresponding .c files. Make sure each .o file depends on the appropriate header files and each program you write depends on the appropriate .o files.

The all target should depend on all of the programs you write.

The clean target should delete all of the programs you write and all of the object files. (Using *.o in clean is perfectly acceptable.)

Testing

The assignment code also comes with a set of testing scripts in the tests directory. You can run them by running $ make check.

There are a handful of tests in the tests directory. You need to write more tests to test the functionality of each part.

As you work through the parts below, run $ make check frequently to make sure that you haven’t broken any code you previously had working.

Travis CI

This homework also uses the Travis CI continuous integration service. Each time you push your code to GitHub, Travis will download the latest version, build your program and then run the tests.

You should be able to see the status of your latest commit by logging in to Travis CI using your GitHub account. You can also see the latest status from Travis by clicking on the 1 branch link near the top of the GitHub page for your repository which has URL https://github.com/systems-programming/${repo_name}/branches. You should see either a green check mark if all tests passed or a red X if one or more tests failed.

You should add a build status image to your README.md by following the instructions to get the Markdown code to paste into README.md.

Part 1. Overview and command line arguments

In this assignment, you will write a program called wordmod that will perform a variety of modifications to words read from stdin and written to stdout. The choice of modification will be controlled by command line flags.

The basic flow of wordmod should be the following.

  1. Use getopt(3) to parse command line arguments and determine which transformation, if any, is to be applied to the input.
  2. In a loop,
    1. Read in and print any characters that do not form part of a word;
    2. Read in a word of up to 45 characters;
    3. Perform the requested transformation by calling an appropriate function on the just-read word; and
    4. Print the transformed word.

There should be a command line option for each transformation described in the subsequent parts. In addition, -h should print out a help message with usage information and exit with exit value 0. Any unknown options should cause the help message and usage information to be printed and exit with exit value 0. (getopt() itself will print a message about invalid options.) As usual, error messages should go to stderr and normal output should go to stdout. (-h produces normal output; unknown options produce error messages.)

Here’s example output.

$ ./wordmod -h
Usage: ./wordmod [OPTIONS]

Options:
  -c NUM  Caesar cipher, shift by NUM
  -h      show this help
  -p      Pig Latin
  -r      reverse
  -s      shouty

If the user specifies multiple transformations, print an error message (on stderr) and exit with a nonzero exit value.
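The option handling described above could be sketched roughly as follows. This is only one possible arrangement, not the required design: the names parse_options, print_usage, and the enum parse_result are hypothetical, and reducing the shift modulo 26 at parse time is an assumption of this sketch. The _POSIX_C_SOURCE definition is there because -std=c11 otherwise hides the declaration of getopt() on glibc systems.

```c
#define _POSIX_C_SOURCE 200809L /* expose getopt() under -std=c11 */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical result codes; main() decides the exit value for each. */
enum parse_result {
  PARSE_OK,
  PARSE_HELP,
  PARSE_UNKNOWN,
  PARSE_MULTIPLE,
  PARSE_BAD_SHIFT
};

static void print_usage(FILE *out, char const *prog) {
  fprintf(out,
          "Usage: %s [OPTIONS]\n\n"
          "Options:\n"
          "  -c NUM  Caesar cipher, shift by NUM\n"
          "  -h      show this help\n"
          "  -p      Pig Latin\n"
          "  -r      reverse\n"
          "  -s      shouty\n",
          prog);
}

/* On PARSE_OK, *transform is 0 (none) or one of 'c', 'p', 'r', 's'. */
enum parse_result parse_options(int argc, char *argv[], int *transform,
                                int *shift) {
  int opt;
  *transform = 0;
  *shift = 0;
  while ((opt = getopt(argc, argv, "c:hprs")) != -1) {
    switch (opt) {
    case 'h':
      print_usage(stdout, argv[0]); /* normal output goes to stdout */
      return PARSE_HELP;
    case 'c':
    case 'p':
    case 'r':
    case 's':
      if (*transform != 0) {
        fprintf(stderr, "%s: only one transformation may be given\n", argv[0]);
        return PARSE_MULTIPLE;
      }
      *transform = opt;
      if (opt == 'c') {
        char *end;
        long n = strtol(optarg, &end, 10);
        if (end == optarg || *end != '\0') {
          fprintf(stderr, "%s: invalid shift '%s'\n", argv[0], optarg);
          return PARSE_BAD_SHIFT;
        }
        *shift = (int)(n % 26); /* keep the sign, drop full rotations */
      }
      break;
    default: /* '?': getopt() has already printed a diagnostic */
      print_usage(stderr, argv[0]);
      return PARSE_UNKNOWN;
    }
  }
  return PARSE_OK;
}
```

A main function would call parse_options(argc, argv, &transform, &shift) once and map each result to the exit value this write-up asks for.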

If the word you read in is longer than 45 characters, do not perform any transformations on it. Just output it as is. For the purposes of this assignment, a word is considered a consecutive sequence of letters, digits, and apostrophes. You may want to write a function isword().

// Returns true if ch is a letter, digit, or apostrophe.
bool isword(int ch);

Take a look at the man pages for isalpha(3) and isdigit(3).
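A minimal sketch of isword() built from those two functions might look like this. It assumes ch is a value returned by getchar(3) or a similar function (an unsigned char value or EOF), which is what isalpha(3) and isdigit(3) require.

```c
#include <ctype.h>
#include <stdbool.h>

// Returns true if ch is a letter, digit, or apostrophe.
// ch must be representable as an unsigned char or be EOF.
bool isword(int ch) {
  return isalpha(ch) || isdigit(ch) || ch == '\'';
}
```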

For each modification, you’re going to call a function and pass the string in as input. The modification function will write the modification to another string. For most of these, it’s important you don’t use the same array for input and output. C won’t complain, but your code will likely crash or otherwise misbehave!

You want something like the following.

while (!feof(stdin) && !ferror(stdin)) {
  char word[INPUT_SIZE+1];
  char modified[OUTPUT_SIZE+1];

  if (!echo_until_word())
    break;
  if (!get_word(word)) {
    /* Handle an overly long word. */
    continue;
  }

  switch(transform) {
  /* ... */
  case 's':
    shouty(modified, word);
    break;
  /* ... */
  }
  /* print the modified word */
}
where

feof(3) and ferror(3)
    are standard library functions (check their man pages);
echo_until_word()
    prints characters until it encounters a word character or EOF, returning true if a word character was encountered and false if EOF;
get_word()
    stores at most INPUT_SIZE word characters (plus a trailing 0 byte), returning true if the next input character is either EOF or not a word character;
transform
    is the selected transformation; and
shouty()
    is defined in Part 2.
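One way the two input helpers might be written is sketched below. Two things here are assumptions rather than part of the assignment: the functions take explicit FILE * arguments (so they can be tried on something other than stdin and stdout), and they use ungetc(3) to push back the one character they read too far. The local isword() is the helper suggested earlier, repeated so the sketch is self-contained.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>

#define INPUT_SIZE 45 /* longest word that gets transformed */

static bool isword(int ch) {
  return isalpha(ch) || isdigit(ch) || ch == '\'';
}

/* Copies non-word characters from in to out. Returns true if it stopped
 * in front of a word character and false if it reached EOF. */
bool echo_until_word(FILE *in, FILE *out) {
  int ch;
  while ((ch = getc(in)) != EOF) {
    if (isword(ch)) {
      ungetc(ch, in); /* leave the word character for get_word() */
      return true;
    }
    putc(ch, out);
  }
  return false;
}

/* Stores at most INPUT_SIZE word characters in word, followed by a 0 byte.
 * Returns true if the next input character is EOF or not a word character,
 * and false if the word continues past INPUT_SIZE characters. */
bool get_word(FILE *in, char word[INPUT_SIZE + 1]) {
  size_t len = 0;
  int ch = getc(in);
  while (len < INPUT_SIZE && ch != EOF && isword(ch)) {
    word[len++] = (char)ch;
    ch = getc(in);
  }
  word[len] = '\0';
  if (ch == EOF)
    return true;
  ungetc(ch, in); /* ch belongs to whatever comes next */
  return !isword(ch);
}
```

When get_word() returns false, the caller still has to print the partial word and then copy the rest of the overly long word through unchanged.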

For each of the functions described in the remaining parts, you might want to add a size_t size parameter that is the size of your output array. You would modify the code above to call it like

    shouty(modified, word, sizeof modified);

This size might be useful if you use strlcpy(3)/strlcat(3). Note that if you want to use those functions, you need to modify your Makefile to include -lbsd at the end of the recipe for wordmod:

wordmod: $(wordmod_objs)
        $(CC) $(LDFLAGS) -o $@ $^ -lbsd

Part 2. Shouty

The first transformation is shouty. For this part, you should write a function

void shouty(char *output, char const *input);

which takes a string input and copies it to the string output, making each character upper case.

This transformation should be used when the user passes the -s option.

$ echo 'Sometimes I feel shouty!' | ./wordmod -s
SOMETIMES I FEEL SHOUTY!

The toupper(3) function may be useful.
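A minimal sketch of shouty(), assuming output points to an array at least as large as input (including the trailing 0 byte), could look like this. The cast to unsigned char is there because toupper(3) is only defined for unsigned char values and EOF.

```c
#include <ctype.h>

/* Copies input to output, converting each character to upper case.
 * output must have room for strlen(input) + 1 bytes. */
void shouty(char *output, char const *input) {
  while (*input != '\0') {
    *output++ = (char)toupper((unsigned char)*input++);
  }
  *output = '\0';
}
```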

Part 3. Reverse

The next transformation is reverse. For this part, you should write a function

void reverse(char *output, char const *input);

which takes input and reverses it into output.

This transformation should be used when the user passes the -r option.

$ echo 'TACOCAT si a emordnilap' | ./wordmod -r
TACOCAT is a palindrome
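As a rough sketch, reverse() can walk input from the back while filling output from the front. It assumes output is a different array from input with room for strlen(input) + 1 bytes; with a shared array, the first half of the word would be overwritten before it was read.

```c
#include <string.h>

/* Writes the characters of input into output in reverse order.
 * output must not overlap input. */
void reverse(char *output, char const *input) {
  size_t len = strlen(input);
  for (size_t i = 0; i < len; ++i) {
    output[i] = input[len - 1 - i];
  }
  output[len] = '\0';
}
```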

Part 4. Caesar cipher

For this transformation, you should implement the Caesar cipher with a variable shift count. Letters should retain their case; digits and apostrophes should be unchanged. The amount to shift should be given as an argument to the -c option. Positive shift amounts should shift forward in the alphabet; negative shift amounts should shift backward. (See the examples below.)

Implement

void caesar(char *output, char const *input, int shift_amount);

which shifts each letter character of input by shift_amount.

$ echo '0 shifts given' | ./wordmod -c 0
0 shifts given
$ echo 'Fuvsg ol 13; irel fuvsgl!' | ./wordmod -c 13
Shift by 13; very shifty!
$ echo 'Vkliw lqwr uhyhuvh.' | ./wordmod -c -3
Shift into reverse.

Check out the man pages for isupper(3) and islower(3).
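The shifting arithmetic might be sketched as below. The helper shift_letter is hypothetical, and the code assumes an ASCII-like character set where A through Z and a through z are contiguous. The shift is reduced modulo 26 and then offset by 26 so that negative shifts never produce a negative index.

```c
#include <ctype.h>

/* Hypothetical helper: rotates ch within the 26 letters starting at base. */
static char shift_letter(char ch, char base, int shift) {
  int offset = ((ch - base) + shift % 26 + 26) % 26;
  return (char)(base + offset);
}

/* Shifts each letter of input by shift_amount, keeping its case.
 * Digits and apostrophes are copied unchanged. */
void caesar(char *output, char const *input, int shift_amount) {
  for (; *input != '\0'; ++input, ++output) {
    unsigned char ch = (unsigned char)*input;
    if (isupper(ch)) {
      *output = shift_letter(*input, 'A', shift_amount);
    } else if (islower(ch)) {
      *output = shift_letter(*input, 'a', shift_amount);
    } else {
      *output = *input;
    }
  }
  *output = '\0';
}
```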

Part 5. Pig Latin

For this part, you’ll be implementing a transformation to turn English into Pig Latin (don’t worry about the rules described on Wikipedia, just follow the ones given below).

Implement the function

void pig(char *output, char const *input);

according to the following rules.

  1. If the word contains anything but letters (which for our purposes means it contains a digit or apostrophe), don’t transform it.
  2. If the word contains no vowels—a, e, i, o, u, or y—don’t transform it.
  3. If the word starts with a, e, i, o, or u, append “yay”.
  4. Otherwise move all of the characters before the first vowel—a, e, i, o, u, or y—to the end of the word and append “ay”.

The exception is that “qu” should be treated as a consonant in these rules. That is, “qu” remains “qu” (because rule 2 says it has no vowels) and “quite” becomes “itequay” rather than “uiteqay” (because rule 4 says the first vowel is the “i”).

In addition to the rules above, you need to modify capitalization such that if the input word starts with a capital letter, then the output word should start with a capital letter and the original capital letter should be lowercased. (See the examples below.)

$ echo "That'll do, pig." | ./wordmod -p
That'll oday, igpay.
$ echo 'Linux is pretty OK.' | ./wordmod -p
Inuxlay isyay ettypray OKyay.

You may find strcspn(3) useful.
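Here is one way the rules above could fit together, offered as a sketch rather than a reference solution. It assumes output has room for strlen(input) + 4 bytes (the word, “yay”, and the 0 byte), treats a “u” directly after a “q” as part of the consonant, and follows the rules literally for a word that begins with “y” (nothing is moved and “ay” is appended).

```c
#include <ctype.h>
#include <string.h>

#define VOWELS "aeiouyAEIOUY"

void pig(char *output, char const *input) {
  size_t len = strlen(input);

  /* Rule 1: a digit or apostrophe anywhere means no transformation. */
  for (size_t i = 0; i < len; ++i) {
    if (!isalpha((unsigned char)input[i])) {
      strcpy(output, input);
      return;
    }
  }

  /* Find the first vowel, skipping any 'u' that directly follows a 'q'. */
  size_t first = strcspn(input, VOWELS);
  while (first > 0 && first < len &&
         tolower((unsigned char)input[first]) == 'u' &&
         tolower((unsigned char)input[first - 1]) == 'q') {
    ++first;
    first += strcspn(input + first, VOWELS);
  }

  /* Rule 2: no vowels at all. */
  if (first == len) {
    strcpy(output, input);
    return;
  }

  /* Rule 3: starts with a, e, i, o, or u (but not y). */
  if (first == 0 && tolower((unsigned char)input[0]) != 'y') {
    strcpy(output, input);
    strcat(output, "yay");
    return;
  }

  /* Rule 4: move the leading consonants to the end and append "ay". */
  memcpy(output, input + first, len - first);
  memcpy(output + (len - first), input, first);
  strcpy(output + len, "ay");

  /* Capitalization: the new first letter takes over the capital. */
  if (first > 0 && isupper((unsigned char)input[0])) {
    output[0] = (char)toupper((unsigned char)output[0]);
    output[len - first] = (char)tolower((unsigned char)output[len - first]);
  }
}
```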