Alex Blog: January 2012

Two Months of Chinese Language - Stories of Its Usefulness

Saturday, January 28, 2012

In my last post, I explained my recent shift in priorities, which resulted in one of my hobbies, Chinese language learning, moving to the number one position. I spent ~2 months studying Chinese in preparation for a solo vacation to Taiwan, to test myself by seeing how much of a language one can learn in two months.

How did I score on my self-test? Not as well as I hoped, but still successful. Because I was only able to understand a few words of each sentence, I wasn't able to grasp much meaning. I *was* successful in communicating on a few occasions. I was city-walking, trying to find the hiking path to the top of a large hill. I stopped a guy walking on the street with a "dui bu qi" (excuse me) and explained that "wo xiang qu zhe li" (I want to go here) and pointed to my map. Success! He spoke some fast Chinese that I didn't understand, but he also used hand motions! Straight ahead and left! Alright! I found the mountain, but couldn't find the hiking trail to go up.

I saw another bored-looking guy, so I asked him "wo xiang qu shang" (I want to go up). I hoped I didn't sound like a caveman with such simple sentences. I guess my tones were right, because he didn't look offended as if I had insulted his mother. He also used hand motions! Success!

I spent some time in Japan, and made a good friend who is from central Taiwan. I took the opportunity to meet up with her again, and I spent a few days with her family. Her family was very welcoming toward me. They had a car! This was so nice to see after city-walking in Taipei for 5 days. They took me to a few of their favorite restaurants. Real Chinese food is not street food? Trip-changing experience! Home-made food is what? Noodles, rice, and veggies! So interesting! They drove me to a mountain where some of the best Oolong tea is made. How educational! I didn't know Oolong tea could be so delicious! And I didn't know what tea fields look like. Rows of bushes on hillsides in the clouds! Beautiful, educational!

I felt so bad about not being able to make fulfilling meal-time conversation. I wish I could have told them what my life was like, and what I found interesting about their lives. I wish I could have thanked them in better Chinese. I hope my gestures and thoughts of thanks were picked up by their sense of empathy. If I say "xie xie" (thank you) five times in a row, does that properly mean "Thanks! I owe you so much, and your home and family is so awesome! I had so much fun!"? I sure hope they got the message.

The one part of the language that I totally failed at? Ordering food. At most of the restaurants I visited in Taiwan, there are no pictures of food you can use to decide what to order. There's a sheet of paper with a grid on it. One column of the grid is filled with Chinese words for foods. The other column is for you to place checkmarks. This is broken up into categories. So, if McDonald's used this concept (they totally should), to order a burger, you would take a sheet of paper, put a checkmark next to 'hamburger', 'cheese', 'pickles', 'bacon', and 'lettuce', and not forget to put a checkmark next to 'fries'. This is a very efficient way to order, I think, but if you can read *none* of the words, you just put checkmarks next to random words that you like - maybe a word has a simple character or it's one that you recognize. More than once, I was surprised by what I got, and I still have no idea how to order it again. I need to learn more food words before the next time I travel to Taiwan or China.

Priority Change - Chinese Studies Promoted

Most of 2011 has seen me diligently studying the art of software development. It's a very deep topic that could keep me occupied for the rest of my life. I'm lucky to be able to work and stay interested in such a deep discipline. I've developed a few other interests in the second half of 2011, one of which is the Chinese language. A recent development has caused me to push my Chinese studies up to number one, ahead of software studies. This means that I won't be blogging about software for a while. :( This kind of saddens me because I enjoy software, and investing time in it will help me out in my career as well. But, as with investments, it is smart to diversify. If the software industry dries up (can't imagine why) or my life changes drastically and I lose interest in it, my trump card will be useless. So onwards to investing time in hobbies.

Why Chinese? Why *not* Chinese? I've learned that you shouldn't have to justify your interests; it is an indescribable force that captures one's interests and it should be trusted.

Well, maybe I can find a small influence for the development of this interest. I had been building up vacation days, so I had to start thinking about what to do with them. While it would be nice in the short term, spending a week in a tropical paradise didn't suit me. I needed something that I could explore and learn about. After much thought, I decided on two ways to use my vacation time most effectively: a) Travel to a fun city that also has a software conference to attend, or b) Learn a new language and travel to a place that speaks it as a test for myself.

Which did I choose? Well, if I can decide on a good conference to go to, my company would pay for me to attend it - no need to spend my vacation time. So I decided to test myself. I bought a round trip ticket to Taiwan, scheduled for 2 months in the future, and attempted to learn as much Chinese as I could in two months.

How much Chinese did I learn before departing? I was pretty motivated during those two months. I may write another blog post detailing my strategy, which proved to be pretty effective, but I can summarize it here. I listened to many hours of basic Chinese phrases in situations. I had to listen to each lesson ~5 times before I was able to pick up any words. Separately, I started doing flashcards. It is pretty easy to find flashcards for all the basic Chinese, such as "Thank you", "Goodbye", "This is delicious", and "Where's the bathroom". I tried to learn ~20 new words each day (probably more in reality). I think I had completed a deck of 500 words before leaving, and crammed another 200 on the flight to Taiwan (it was a long flight).

Read part two of this post here, where I tell stories about the few times that my Chinese studies paid off.

2011 Year-in-review and Personal Progress

Saturday, January 14, 2012

2011 has ended and 2012 begins. I sure hope I'm not the same person that I was 1 year ago. So how have I changed? Where have I improved? What new skill have I learned? What is my living situation and happiness level?

I've learned a lot in the last year. I'd like to document a few of those things here, so I can fondly look back on them at a later time.

Separating presentation from data on a web page.
I spent a few weeks this year lightly researching client-side web applications, even creating a prototype web application that is highly responsive using Backbone.js. This is one part of software that I thought I would never understand because of JavaScript, Ajax, and talking across the network, so I'm pretty proud of this one.

MVC and other presentation patterns.
While I'm still not an expert in this area, I think my knowledge is now greater than that of a large percentage of my peers. I don't have a single piece of software to demonstrate my new-found knowledge, which is true for much of what I learn, but this knowledge will make itself evident in other software I write moving forward. Varieties of MVC are present in most areas of software engineering now: iOS, Android, web pages, and desktop applications all use a variety of MVC to keep their applications clean.

Salesforce development.
While knowledge of writing applications for Salesforce is not directly transferable to other disciplines, it still allowed me to learn and practice methods of controlling complexity, updating legacy (to me) code, and querying information from and persisting information to a database. Also, while in Salesforce land, I was able to practice object-oriented programming, which was a roller-coaster ride of "Yes, I totally get it!" and "I understand nothing!" feelings.

Version control systems.
I did quite a bit of research and reading on various version control systems. I researched not only the version control software tools themselves, but also the software project management practices and release processes that rely on these software tools. It's a complex topic that spans technical areas, human management, and gray-area decisions. While I learned the basics comprehensively, I feel that there is still more practical stuff to learn.

Professional recognition.
I was invited to join the experts program, which places me as a bust on the masthead of my company's ship. I write well-crafted blog entries a few times a month, which give me an outlet for the technical thoughts in my head. I also set up a tech talk at work, for which I created a slideshow and practiced a talk that introduces the technical basics of Git. I presented the tool to the entire software team in a factual manner, and intend to give a follow-up presentation to discuss the release-management styles that are used with distributed version control systems. I am making it my goal to give more tech talks this year - a solid topic once per month.

Non-work related hobbies.
Besides work, I've decided to learn Mandarin Chinese. It's a challenging language, but I am enjoying studying it. It's far easier than Japanese grammatically, but I think that proper pronunciation will elude me for quite some time. Software has been my number one priority, so Chinese only received ~10 hours per week, which I worry is not enough. I took a vacation to Taiwan in September to test my knowledge. I found that I can speak very little after just 2 months of studying, but it was enough for me to ask "Which way?" questions and use "Please help me." phrases. I have learned a number of words now, enough so that I may be able to start chatting with Mandarin-speakers on the internet to learn more. Conversation flow and phrases are pretty difficult.

I hope 2012 brings me as far as 2011 brought me. As long as I keep improving and I'm recognized for it, I'll be happy working with my current company.

Alex Reads - MS Research - Cohesive and Isolated Development with Branches

Post-read thoughts -
This paper seems like it was written by amateurs. Note that I am not a member of the academic community, nor do I write academic papers, so this is more of a comment on their writing style and their ability to defeat my BS filter (i.e. Can you prove that? How exactly do you define 'x'?).
Having said that, there are some useful ideas and interesting results from their interviews and research with real projects. Here's what I found interesting:

Studies show that branch usage greatly increases with new adopters of DVC.

Pre-DVC, 1.54 branches/month. With-DVC, 3.67 branches/month (though I worry about methods used to obtain this info)
The idea that prior to DVC, branches were created only for releases, not new features.
To effectively use DVC branches, create one for each new feature, localized bug fix, or maintenance effort.

Studies show that even with DVC, a central repo is still used. (It is important to admit this, IMO)

An accessible DVC repo enables anyone to contribute to the project. Developers without commit privileges were reduced to working w/o VC. Accepting changes from unofficial project members has high barriers.
Academics advise us to checkpoint code at frequent intervals in a place separate from the 'team repo'. Only tested and stable code should be integrated into the 'team repo'. DVC systems enable and encourage this practice.

The term "Semantic conflict" - All VC systems are good at syntactic conflicts, but not semantic conflicts.
Awareness of 'distracted commits', which are commits that are required to resolve merge conflicts.

Link to Microsoft Research paper -
Introduction web page - http://research.microsoft.com/apps/pubs/default.aspx?id=157290
Research paper [PDF] - http://research.microsoft.com/pubs/157290/paper.pdf

Abstract. The adoption of distributed version control (DVC), such as Git and Mercurial, in open-source software (OSS) projects has been explosive. Why is this and how are projects using DVC? This new generation of version control supports two important new features: distributed repositories, and history-preserving branching and merging where branching is easier, faster, and more accurately recorded. We observe that the vast majority of projects using DVC continue to use a centralized model of code sharing, while using branching much more extensively than when using CVC. In this study, we examine how branches are used by over sixty projects adopting DVC in an effort to understand and evaluate how branches are used and what benefits they provide. Through interviews with lead developers in OSS projects and a quantitative analysis of mined data from development histories, we find that projects that have made the transition are using observable branches more heavily to enable natural collaborative processes: history-preserving branching allow developers to collaborate on tasks in highly cohesive branches, while enjoying reduced interference from developers working on other tasks, even if those tasks are strongly coupled to theirs.

Introduction

Purpose of Version Control

Create isolated workspace from a particular state of the source code.
Can work within one branch without impacting other developers

Purpose of branches

Should be 'cohesive' so that a team can work together on a branch
Keeps new features separate, and allows merging features when complete

Evolution of VC systems

Marked by 'increasing fidelity of the histories they record'
1st gen - record individual file changes - can roll back individual files (RCS)
2nd gen - record sets of file changes (transactions) that can be rolled back (CVS)
3rd gen - records history of files even through branching and merging (DVC)

DVC features

Every copy of a project is a complete repository, complete with history
Can exchange source code changes with other peer repositories
Preserves history through branches and merges

Each child commit tracks its parent commits - across branches and merges
Allows us to quantitatively study branch cohesion and isolation
Allows us to study the relationship of branch usage with defect rates and schedule delays
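
To make the "each child commit tracks its parent commits" note concrete, here is a minimal sketch of such a history as a data structure. The `Commit` class, the `ancestors` helper, and the four-commit history are all made up for illustration; this is not how Git or Mercurial actually store commits, only the general shape of the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    """A commit that remembers its parents (hypothetical structure)."""
    id: str
    parents: tuple = ()  # zero parents = root, two parents = merge


def ancestors(commit_id, commits):
    """Return the ids of every commit reachable through parent links."""
    seen = set()
    stack = list(commits[commit_id].parents)
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(commits[current].parents)
    return seen


# A tiny invented history: "b" is trunk work, "c" is branch work,
# and "m" merges the two, so it records both parents.
history = {
    "a": Commit("a"),
    "b": Commit("b", ("a",)),
    "c": Commit("c", ("a",)),
    "m": Commit("m", ("b", "c")),
}

print(sorted(ancestors("m", history)))  # ['a', 'b', 'c']
```

Because the merge commit keeps both parents, the branch's own commits stay visible after the merge, which is what makes it possible to study branches after the fact.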

Why has DVC become so popular?

Developers wanted to use branches, but experienced "merge pain" with CVS

Studies show that branch usage greatly increases with new adopters of DVC
Studies show that even with DVC, a central repo is still used
Can observe that branched history can be linearized into a single 'mainline' branch

RQ2 is "How cohesive are branches?"

'Cohesivity' is measured by directory distance of files modified in a branch (wha?)
Compare branch cohesion in Linux history against trunk branch cohesion
If branches are not more cohesive, then either a) trunk is more cohesive or b) directory distance is not a good measurement for 'cohesivity' (lol)
Results - branches are far more cohesive than background commit sequences (background?)
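
Since "directory distance" puzzled me, here is a rough sketch of one way such a measure could work. The definition below (count the directory steps up to the nearest shared directory and back down, then average over all file pairs) is my own assumption and not necessarily the paper's exact formula; the function names and file paths are invented for illustration.

```python
from itertools import combinations
from pathlib import PurePosixPath


def directory_distance(path_a, path_b):
    """Directory steps from one file's folder to the other's (assumed metric)."""
    dirs_a = PurePosixPath(path_a).parent.parts
    dirs_b = PurePosixPath(path_b).parent.parts
    shared = 0
    for a, b in zip(dirs_a, dirs_b):
        if a != b:
            break
        shared += 1
    # Steps up from A to the shared directory, plus steps down to B.
    return (len(dirs_a) - shared) + (len(dirs_b) - shared)


def mean_distance(paths):
    """Average pairwise distance; a lower value means a more cohesive change set."""
    pairs = list(combinations(paths, 2))
    if not pairs:
        return 0.0
    return sum(directory_distance(a, b) for a, b in pairs) / len(pairs)


focused = ["drivers/net/e1000.c", "drivers/net/tg3.c", "drivers/net/Kconfig"]
scattered = ["drivers/net/e1000.c", "fs/ext4/inode.c", "README"]

print(mean_distance(focused))    # 0.0
print(mean_distance(scattered))  # larger, since the files live far apart
```

Under this assumed definition, a branch that only touches one directory scores 0, and a branch that touches files all over the tree scores higher.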

RQ3 is "How successfully do DVC branches isolate developers?"

VC is good about flagging syntactic changes between branch-time and merge-time
VC is not good about flagging semantic changes between branch-time and merge-time

Semantic = assumptions made during development (so, API/method changes?)
Branch coupling causes semantic conflict

Semantic conflict is the number of files in a branch that were also modified in trunk since the fork
Measure how often a semantic conflict would interrupt a developer if using no branching
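
As I read the note above, the semantic-conflict count boils down to a set intersection. A minimal sketch, assuming we already have the list of files touched on the branch and the list touched on trunk since the fork (the function name and file names are hypothetical, and the paper's real measurement is likely more involved):

```python
def overlapping_files(branch_files, trunk_files_since_fork):
    """Files changed on both the branch and trunk since the fork point."""
    return set(branch_files) & set(trunk_files_since_fork)


# Invented example data.
branch_files = ["net/socket.c", "net/core/dev.c", "include/linux/net.h"]
trunk_files = ["include/linux/net.h", "fs/read_write.c", "net/socket.c"]

overlap = overlapping_files(branch_files, trunk_files)
print(sorted(overlap))  # ['include/linux/net.h', 'net/socket.c']
print(len(overlap))     # 2
```

A count of zero would suggest the branch was well isolated from concurrent trunk work; a high count suggests the branch and trunk are coupled.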

Paper proves three things

Prove that branching, not distribution, has driven popularity in DVC
Define two new measures, branch cohesion and distracted commits

'Distracted commits' are new commits required to resolve merge conflicts

Show that branches are used to undertake cohesive development tasks
Show that branches effectively protect developers from concurrent development interruptions

Theory

History

Git and Mercurial basic history - birth, growth, majority use in Debian
Adopting new VC is very difficult - citing experiences by Gnome, KDE, and Python

RQ1 "Why did projects rapidly adopt DVC?"

Interviews show that main reason is to use branches for better cohesion and isolation
Exactly how cohesive are branches? How well do they isolate feature teams?
If developers use branches to isolate tasks, branches will be cohesive. On the other hand, developers could use branches merely to isolate personal development work, without separating work into tasks

RQ2 "How cohesive are branches?"

Coupling and Interruption

Should checkpoint code at frequent intervals separate from 'team repo' - only tested and stable code should be integrated into 'team repo'
When ready, integration must not be difficult or the gains of a personal branch are lost
When not using branches, changes are not proven stable, require integration work
Studies show that resuming from interruption takes at least 15 minutes

RQ3 "To what extent do branches protect developers from integration interruptions caused by concurrent work in other branches?"

Methodology

Began with interviews to develop hypotheses regarding motivations for adoption
Empirically evaluating by performing statistical analysis
Semi-structured interviews (sounds like high probability for introduction of non-scientific bias)

Evaluation

Description of linearizing a branched DVC history

Project concurrent sequence of changes onto single timeline
Commits on this timeline represent changes 'across' branches
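
My notes don't record the paper's exact linearization procedure, so the following is only a guess at the simplest version: take the commits from every branch and order them by timestamp onto one timeline. `TimedCommit`, `linearize`, and the sample data are hypothetical names and values for illustration.

```python
from typing import NamedTuple


class TimedCommit(NamedTuple):
    id: str
    branch: str
    timestamp: int  # e.g. seconds since some reference point


def linearize(commits):
    """Order commits from all branches onto one timeline (timestamp, then id)."""
    return sorted(commits, key=lambda c: (c.timestamp, c.id))


commits = [
    TimedCommit("f2", "feature", 40),
    TimedCommit("t1", "trunk", 10),
    TimedCommit("f1", "feature", 20),
    TimedCommit("t2", "trunk", 30),
]

timeline = linearize(commits)
print([c.id for c in timeline])      # ['t1', 'f1', 't2', 'f2']
print([c.branch for c in timeline])  # ['trunk', 'feature', 'trunk', 'feature']
```

On the flattened timeline, work from different branches is interleaved, which is roughly what a developer would experience if everyone committed to a single mainline.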

Rapid DVC adoption

Observe that, contrary to common knowledge, most DVC projects do not make use of distribution

Of 60 projects, all but Linux use centralized model around single public repo

(this doesn't make sense. I think their understanding of 'distributed' is off)

Some branches that grew too different from trunk had to be abandoned
Prior to DVC, branches were created only for releases, not new features
Pre-DVC, 1.54 branches/month. With-DVC, 3.67 branches/month
Developers without commit privileges were reduced to working w/o VC

Accepting changes from unknown devs required huge patch sets

Could not add incremental work
Sometimes included unrelated changes

Therefore, main motivation is branching, not distribution (define "distribution"?)

Cohesion

Large systems structure their files in a modular manner - related files are located nearby (I question this premise)
[Science! Graphs are shown, descriptions and explanations are given]
Results show that branches are relatively cohesive.

Interviews are consistent - branches are created for more than releases (low standard)
DVC branches comprise features, localized bug fixes, and maintenance efforts
Three interviewees indicate that non-trivial changes would have been created offline and then committed in a single mega-commit

Coupling and Interruptions

[Hardcore science! Too difficult to understand. Questionably scientific pictures]

Trying to identify and quantify 'semantic conflicts'

Some disclaimer that git allows 'hidden' history in unpublished commits, hidden by rebasing

Related Work

This paper's main concern is to study history-preserving branching and merging

Some people advocate even finer grained history retention
Some people advocate automating information acquisition, such as static relationships

Some people recommend patterns to use for workflows that effectively use branching

Other people advocate workflows that mitigate branching/merging issues

Somebody proposes that current tools and project management are inadequate

Bit of MVC History and Thoughts on the Proliferation of Competing MVC Flavors

Monday, January 2, 2012

While I've been exploring various implementations of presentation patterns/frameworks in JavaScript, I've started questioning the MVC (Model-View-Controller) pattern as a whole. What problems does it solve? How is it different from competing presentation pattern ideas, such as MVP and MVVM? I'll use this blog post as a way to organize my findings and thoughts. I'll give a bit of history first, then give a bit of speculation at the end.

MVC is an architectural pattern, the purpose of which is mainly code organization and separation of concerns. It was conceived a long time ago (1979) by Trygve Reenskaug. He was a member of the Smalltalk community in the early days of GUI design, and took part in the early conversations of various patterns for organizing code when creating solutions for handling user input in a GUI context. He authored his first paper on MVC, titled THING-MODEL-VIEW-EDITOR, which details one such pattern. The community later distilled these terms, explained here, to become model, view, and controller, as defined in this revised paper.
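
To make the separation of concerns a little more tangible, here is a toy sketch of the three roles. It is my own minimal illustration, not Reenskaug's original formulation or any particular framework's API; the class names and the counter example are invented.

```python
class CounterModel:
    """Holds state and notifies listeners; knows nothing about display."""

    def __init__(self):
        self.count = 0
        self._listeners = []

    def subscribe(self, listener):
        self._listeners.append(listener)

    def increment(self):
        self.count += 1
        for listener in self._listeners:
            listener(self)


class CounterView:
    """Turns model state into output; here it just records rendered strings."""

    def __init__(self):
        self.rendered = []

    def render(self, model):
        self.rendered.append(f"Count: {model.count}")


class CounterController:
    """Translates user input into operations on the model."""

    def __init__(self, model):
        self.model = model

    def handle_click(self):
        self.model.increment()


model = CounterModel()
view = CounterView()
model.subscribe(view.render)  # the view observes the model
controller = CounterController(model)

controller.handle_click()
controller.handle_click()
print(view.rendered)  # ['Count: 1', 'Count: 2']
```

The model never references the view directly, so the display could be swapped out without touching the state or the input handling.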

It is important to note, however, that this architectural pattern was conceived before complex internet pages and internet applications were possible. Rather, this first conception of MVC was a GUI solution within the problem domain of desktop applications. I believe this style of MVC, which uses multiple layered views, is used on OS-level platforms, such as and now seems incongruent as a pattern for web app servers and page generation. Internet pages and internet applications have a much different set of limitations than desktop programs - the most notable of which include the stateless nature of HTTP and the added cost of sending data back and forth across the wire between the client and the server.

Because of the popularity of the MVC pattern, it was used as the pattern for delivering web pages in the internet age. Because of the differences in the problem domain, however, the pattern evolved to fit the new problem domain. It is possible that this general incompatibility was one of the central reasons for the many MVC spin-offs that have been conceived since then, though this is pure speculation. An equally plausible reason would be that people started using MVC without fully understanding the reasoning behind the original MVC, or without knowing that an earlier MVC existed.

I wonder about the reader's thoughts. Do you think the reason for the proliferation of varying ideas of MVC is that the domain changed to the web? Or is it because people started using it while having a poor understanding of its reasoning and concepts? Is this a waste of the brain's processing time? Maybe, but I enjoy it.