Software Carpentry - Overview INSTAAR

Aron Ahmadia and Chris Kees

January 2015

Copy This Lecture!

Creative Commons License
Software Carpentry Overview by Software Carpentry is licensed under a Creative Commons Attribution 3.0 Unported License.

More About Software Carpentry

History

What We Teach

What We Actually Teach

How to THINK like a programmer

Who We Teach

Who We Are

Our Goals for You

We will take you on a tour of:

Some High-Level Advice

Introduction

A plethora of best practices is available to help you with scientific computing. You will likely have time to learn only a subset of them. The purpose of this lecture is to help orient you on the path to writing software as part of your research by:

The 8 Essential Practices

  1. Write Programs for People, Not Computers
  2. Let the Computer Do the Work
  3. Make Incremental Changes
  4. Don't Repeat Yourself (or Others)
  5. Plan for Mistakes
  6. Document Design and Purpose, Not Mechanics
  7. Design Flexibly for Performance, Build Accessibly for Correctness
  8. Collaborate

1. Write Programs For People, Not Computers

2. Let the Computer Do the Work

3. Make Incremental Changes

Organize with Wikis

Use Version Control for Checkpointing and Collaboration

4. Don't Repeat Yourself (or Others)

Automate common actions by saving simple blocks of code into scripts

Refactor commonly used blocks of code into functions

Group commonly used functions into libraries
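
The progression above (script, then function, then library) can be sketched as follows. This is a minimal illustration, assuming a hypothetical task of converting temperature readings; the names `fahrenheit_to_celsius` and `convert_all` and the sample readings are invented for the example and are not part of the lecture.

```python
def fahrenheit_to_celsius(temp_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def convert_all(readings_f):
    """Apply the conversion to every reading instead of repeating the formula."""
    return [fahrenheit_to_celsius(t) for t in readings_f]


if __name__ == "__main__":
    # Hypothetical sample readings, for illustration only.
    print(convert_all([32.0, 212.0]))
```

Saved in a file, these functions could be imported by other scripts, which is one simple way a group of functions starts to become a library.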

5. Plan for Mistakes

Verify and Validate your Code
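
One possible way to act on this heading is to pair defensive assertions with small tests against answers worked out by hand. The sketch below assumes a hypothetical `mean` function and test values chosen for the example; it is not taken from the lecture.

```python
import math


def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    # Defensive check: fail early with a clear message on bad input.
    assert len(values) > 0, "mean() requires at least one value"
    return sum(values) / len(values)


def test_mean():
    """Compare results against answers worked out by hand."""
    assert mean([1.0, 2.0, 3.0]) == 2.0
    # Floating-point results are compared with a tolerance, not with ==.
    assert math.isclose(mean([0.1, 0.2, 0.3]), 0.2)


if __name__ == "__main__":
    test_mean()
    print("all checks passed")
```

Running the tests after every change is one way to catch a mistake close to the edit that introduced it.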

6. Document Design and Purpose, Not Mechanics

Principles of documentation
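
As one reading of "document design and purpose, not mechanics", the sketch below puts the why and the design choice in the docstring and leaves the mechanics to the code itself. The `normalize` function is a hypothetical example, not part of the lecture.

```python
def normalize(counts):
    """Scale raw counts so they sum to 1.

    Purpose: lets samples of different sizes be compared directly.
    Design: returns a new list and leaves the input untouched, so the
    raw data stay available for later checks.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("cannot normalize counts that sum to zero")
    # A mechanics-only comment such as "divide each count by total"
    # would add nothing that the next line does not already say.
    return [c / total for c in counts]
```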

7. Design Flexibly for Performance, Build Accessibly for Correctness

Be fluent in multiple languages

You speak multiple languages when interacting with a computer. Choosing to use a new tool, library, or language can be similar to learning a new language:

Use domain specific languages and libraries to increase your expressivity

Use REPL Environments for Development

REPL (read-eval-print loop) environments tighten the coupling between the code you write and the results you see, which increases productivity.

REPL                  non-REPL
IPython and Python    C/C++
Julia                 Fortran
Interactive Sessions  Batch Systems

8. Collaborate

Reduce Complexity

Aim for reproducibility

Schedule

Closing Thoughts

You sometimes need geeks. You never need dorks.

References and Further Reading

Research Literature

Programming Languages for Scientific Computing

Matthew G. Knepley

Preprint: http://arxiv.org/pdf/1209.1711.pdf

Gives an overview of modern programming languages and techniques such as code generation, templates, and mixed-language designs. This is a preprint, so expect some rough spots.

Two Solitudes

Greg Wilson

Slides: http://www.slideshare.net/gvwilson/two-solitudes

Describes Greg's journey as a scientist and leader of the Software Carpentry project, and provides some insight into the differences between industry and academia.

Best Practices for Scientific Computing

D. A. Aruliah, C. Titus Brown, Neil P. Chue Hong, Matt Davis, Richard T. Guy, Steven H. D. Haddock, Katy Huff, Ian Mitchell, Mark Plumbley, Ben Waugh, Ethan P. White, Greg Wilson, Paul Wilson

Preprint: http://arxiv.org/abs/1210.0530

A good summary of many fundamental practices for working with and developing scientific software. This is a preprint, so expect some rough spots.

Web References

What Every Computer Scientist Should Know About Floating-Point Arithmetic

David Goldberg

Web article: http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html

Introduction to the IEEE floating-point standard, its implications, and many of the common pitfalls when using floating-point numbers in scientific computing.

Science Code Manifesto

http://sciencecodemanifesto.org

Publicly signed commitment to clear licensing and curation of software associated with research publications.