---
course_title: Intro to DH
title: "Software Carpentry: Command line and Version control"
week: Week 2
---

### Basics!

```
ls
cd
head
tail
```

### Cut tables, count lines, count values

```
cut -f11 2014-01_JA.tsv
cut -f11 2014-01_JA.tsv | wc -l
cut -f11 2014-01_JA.tsv | sort
cut -f11 2014-01_JA.tsv | uniq -c
cut -f11 2014-01_JA.tsv | sort | uniq -c
cut -f11 2014-01_JA.tsv | sed '/^$/d' | sort | uniq -c
tail -n +2 2014-01_JA.tsv | cut -f11 | sed '/^$/d' | sort | uniq -c
```

### Search for gender in XML file

```
grep "Mr" *.xml
grep "Mr" *.xml | grep -v "Mrs"
grep "Mission" *.xml
grep "Mission" *.xml | grep "Miss"
grep -Ec "\bMrs\b" *.xml
grep -Evc "\bMrs\b" *.xml
grep -E "\bMrs\b" *.xml
grep -Ec "\bMrs\b|\bMiss\b" *.xml
```

### Search for professions in XML file

```
grep "" *.xml | grep -E "\bMrs\b|\bMiss\b" | grep -Eo ",[^,]+"
grep "" *.xml | grep -Ev "\bMrs\b|\bMiss\b" | grep -Eo ",[^,]+" | sort | uniq -c | sort
grep "" *.xml | grep -Ev "\bMrs\b|\bMiss\b" | grep -Eo ",[^,]+" | sort | uniq -c | sort -n
```

### Summarise tweet frequency:

```
cut -d' ' -f 1,2,3 tids-created_at.txt
cut -f2 -d, tids-created_at.txt | cut -f1,2,3 -d' '
cut -f2 -d, tids-created_at.txt | cut -f2,3,6 -d' ' | uniq -c
cut -f2 -d, tids-created_at.txt | cut -f2,3,6 -d' ' | sort | uniq -c
```

### Move and arrange a corpus of computer game ROMs

```
mkdir euro
find . -name "*Europe*" -exec mv {} ./euro/ \;
```
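
Before running a `find ... -exec mv` on a real corpus, it can help to rehearse it in a throwaway directory. The sketch below is one possible way to do that, not part of the original exercise: the `demo` variable and the three `.rom` file names are made up for illustration. It also adds `-maxdepth 1` and `-type f`, which are available in GNU and BSD `find` (they are not in every implementation), so that the search does not descend into `euro` itself or match directories.

```shell
# Hypothetical sandbox: the directory and file names are invented for this demo.
demo=$(mktemp -d)
touch "$demo/Game A (Europe).rom" "$demo/Game B (USA).rom" "$demo/Game C (Europe).rom"
mkdir "$demo/euro"

# Dry run: print what would be moved, without moving anything.
# -maxdepth 1 stops find from descending into euro/; -type f skips directories.
find "$demo" -maxdepth 1 -type f -name "*Europe*" -print

# The actual move, using the same tests as the dry run.
find "$demo" -maxdepth 1 -type f -name "*Europe*" -exec mv {} "$demo/euro/" \;

# Inspect the result: only the two Europe files should be listed.
ls "$demo/euro"
```

Running the `-print` version first shows exactly which files match the pattern, so a too-broad pattern can be caught before anything is moved.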