Ensuring that contributors are correctly recognised for their work is a cornerstone of the free and open source software community. Here I present a convenient script to help.
When projects migrate to git (and, these days, usually to Github), there is often so much enthusiasm to get up and running that the authors.txt file is not looked at closely, or not at all.
Github does a great job of summarising the free software contributions of each member. To do this, it needs to cross-reference their commits with their email addresses. This possibility is not unique to Github though: as git is a distributed version control system, it is quite possible that other services will seek to mirror and report on code contributions in the future.
authors.txt can be tedious

For a large project with a long history, there may be more than 100 committers, many of them designated by SVN user IDs that are unique to the project and not easily mapped to the user's email address.
If the project is hosted on Google Code SVN, then Google often makes dummy gmail.com accounts for the committers, and many people don't use those accounts for anything else, not even for email. Including these addresses in a git repository is a bad idea if they are not the person's preferred email address. If a committer doesn't have that address associated with their Github account, then their commits won't be easily attributed to them at all.
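As a rough illustration of how those placeholder addresses could be spotted, the sketch below lists the SVN user IDs whose mapped address ends in gmail.com. It assumes the usual git-svn authors file layout (`svnuser = Full Name <email>`); the function name `list_gmail_entries` is hypothetical and not part of any existing tool.

```shell
# List the SVN user IDs in an authors file whose mapped address is @gmail.com.
# Assumes git-svn style lines: svnuser = Full Name <email>
list_gmail_entries() {
    grep -iE '<[^<>]+@gmail\.com>' "$1" \
        | cut -d'=' -f1 \
        | sed -e 's/[[:space:]]*$//'   # trim the space left before the "="
}

# Example (hypothetical file name):
# list_gmail_entries authors.txt
```

Each ID it prints is a candidate for further research, not proof that the address is wrong: some committers do use gmail.com as their preferred address.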
Often, building an accurate authors.txt can be tedious, but here are some steps to simplify it:
Use svn-authors-extract to quickly build a template file by scanning the SVN repository.

How do you know which email addresses are already matched to a Github account and which need further research?
I've written a convenient script to help. It creates a dummy repository with one commit per committer. Upload this dummy repository as a new project on Github and view the commit log page: all the matched user IDs will be highlighted, so the missing ones stand out and you can quickly focus your search for missing email addresses.
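To make the "one commit per committer" idea concrete, here is a minimal sketch of how a single authors.txt line could be split into the name and email that git records for a commit. It assumes the git-svn format `svnuser = Full Name <email>`; the helper names `author_name` and `author_email` and the sample entry are hypothetical, and this uses shell parameter expansion rather than the full script's own approach.

```shell
# Split one authors.txt line into the name and email git should record.
# Assumes the git-svn format: svnuser = Full Name <email>
author_name() {
    info=${1#*= }                  # drop the "svnuser = " prefix
    printf '%s\n' "${info% <*}"    # drop the trailing " <email>"
}

author_email() {
    email=${1##*<}                 # keep what follows the last "<"
    printf '%s\n' "${email%%>*}"   # drop ">" and anything after it
}

line='jdoe = John Doe <john.doe@example.org>'   # hypothetical entry
GIT_AUTHOR_NAME=$(author_name "$line")
GIT_AUTHOR_EMAIL=$(author_email "$line")
echo "$GIT_AUTHOR_NAME / $GIT_AUTHOR_EMAIL"
# prints: John Doe / john.doe@example.org
```

Exporting these values, together with the matching GIT_COMMITTER_NAME and GIT_COMMITTER_EMAIL, before running `git commit` is what makes each dummy commit appear under a different identity.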
I've tested this with the authors.txt file I'm building for the Sipdroid/Lumicall projects. The code comes from Google Code SVN and almost all the identities are obscured by gmail.com addresses that may or may not be the real or preferred email addresses.
I created a dummy project on Github called sipdroid-users and pushed the dummy commits there so I can see how many of the contributors are correctly mapped to a Github account.
A full copy of the script is below; you can also find it in the sync2git repository:
#!/bin/bash

set -e

if [ $# -lt 2 ]; then
  echo "$0 <authors file> <dest repo for push>"
  exit 1
fi

# The authors file path is taken relative to the current directory.
AUTHORS_FILE="$(pwd)/$1"
DEST_REPO="$2"

if [ ! -e "${AUTHORS_FILE}" ]; then
  echo "can't find ${AUTHORS_FILE}, aborting"
  exit 1
fi

# Build the dummy repository in a fresh temporary directory.
TMP_REPO=$(mktemp -d)
TARGET_FILE="testing.txt"

cd "${TMP_REPO}"
git init .

# Make one dummy commit per line of the authors file.
while read -r ; do
  # Each line looks like: svnuser = Full Name <email>
  full_info=$(echo "${REPLY}" | cut -f2 -d'=' | cut -b2-)
  # Name: everything before " <"
  an=$(echo "${full_info}" | sed -e 's/ <.*$//')
  # Email: the text between "<" and ">"
  am=$(echo "${full_info}" | sed -e 's/^.*<//' -e 's/>.*$//')
  cn="$an"
  cm="$am"
  echo "$REPLY" >> "${TARGET_FILE}"
  git add "${TARGET_FILE}"
  export GIT_AUTHOR_NAME="$an"
  export GIT_AUTHOR_EMAIL="$am"
  export GIT_COMMITTER_NAME="$cn"
  export GIT_COMMITTER_EMAIL="$cm"
  git commit -m "Test adding ${REPLY}"
done < "${AUTHORS_FILE}"

git remote add origin "${DEST_REPO}"
# HEAD pushes whichever branch git init created (master or main).
git push -u origin HEAD
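
The script above trusts every line of the authors file, so a blank or malformed line may abort the run or produce a commit with a garbled identity. As a possible pre-flight step, the sketch below prints any line that does not look like `svnuser = Full Name <user@host>`. The function name `check_authors_file` and the exact pattern are assumptions for illustration, not part of sync2git.

```shell
# Print (with line numbers) every line that is NOT of the form
#   svnuser = Full Name <user@host>
# Returns 0 if the file is clean, 1 if malformed lines were found,
# 2 if the file cannot be read.
check_authors_file() {
    [ -r "$1" ] || return 2
    if grep -nvE '^[^=]+ = [^<>]+ <[^<>@ ]+@[^<>@ ]+>$' "$1"; then
        return 1   # grep printed at least one malformed line
    fi
    return 0
}

# Example (hypothetical file name):
# check_authors_file authors.txt && echo "authors.txt looks well-formed"
```

The pattern is deliberately loose about what counts as an email address; it only guards against lines the script could not split into a name and an address.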