I hiked to Heather Lake, roughly 5 miles roundtrip with 1000 ft. of elevation gain. Along the trail, we passed by stumps of old-growth trees, with new, thinner trunks shooting off the stumps. We hiked by waterfalls and over rickety wooden boardwalks. A section of trail was flooded by shallow running water. Near the lake, the trail was covered in snow. I had waterproof boots, microspikes, and poles, so there were no issues. I even had mats to sit on in the snow. It’s great to be geared up!
The lake was mostly covered in ice and mushy snow. We rested on the shore in the snow, a strange contrast to the bright, sunny, 70° weather. We could hear the roar of waterfalls on the other side of the lake, a robust flow from the snowmelt. As we milled at the lake, the number of arriving hikers started to pick up, and on our way down there were some traffic jams. We also missed a turn, and ended up doing a loop through the snow, stepping over tree branches and walking through mud.
Overall, I enjoyed the hike to Heather Lake. It was leisurely, and the views at the lake were gorgeous.
I finally read Lolita, by Vladimir Nabokov. I like Nabokov’s effusive prose, so good.
I read One Hundred Years of Solitude, about the rise and fall of the Buendía family over seven generations. At the beginning of the book, I read it as realistic fiction due to the matter-of-fact tone. But then flying carpets and magical elements were introduced, and I realized these things were taken for granted as completely ordinary, rather than meant to be interpreted as metaphor. Adding to the realism of this fantasy novel, the book interwove actual historical figures and events into the story, such as the banana massacre. Every time I picked up the book, I felt somber afterwards. The decline of the family and their village is foreshadowed and feels inevitable. Buendía family members are born, grow up, live a unique and solitary existence of their own making, then die. In each generation, the children are named after other family members, and so everyone has one of a few names, and the generations follow a cyclical pattern. Earlier events in the book are often recalled. The weight of prior generations stacks, so that by the end of the book, at the mention of a single room, several generations’ worth of memories in that room are recalled. At the end of the novel, a mystery introduced at the beginning of the book is finally revealed, and everything comes full circle.
I moved back to the westside, so I have a long commute. I started geocaching again to pass the time until traffic dies down. Oftentimes, the coordinates given for the cache are off, but they get me to the general vicinity. So I have to rely on a punny name for clues. Here are my latest finds by campus.
There was a Honeywell box geocache under a lamppost skirt near the Honeywell building.
This “basset” cache was found in between rocks in a parking lot.
This “tired” cache was found near a golf course.
This tree hugging cache was found in a tree by a parking lot.
This cache was found on the side of a bike trail. I took a travel bug to bring overseas.
This cache was tricky, because the coordinates pointed to a different lamppost. But the hint was “Black,” so when I saw the black tape I looked under the heavy metal lamppost skirt.
I took a 5-mile walk in the Nisqually National Wildlife Refuge. We rolled in when the visitor center opened at 9 AM and borrowed binoculars there.
At the start of the trail, we saw dozens of sparrows diving in the air and flapping erratically, in contrast to the steady glide of larger birds. We saw several gaggles of Canada geese. Whenever the geese took flight, they would shatter the silence with their loud honking. On the Twin Barns Loop Trail, we tried to find the three baby owls, but apparently they had changed trees. On the Estuary Trail, we spent some time observing two statuesque herons. They slowly waded in the water, then stood patiently still as they fished. We also saw crows, red-winged blackbirds, various species of seagulls, and even an eagle soaring over a narrow strip of trees in the middle of the mudflats.
The visitor center overlooks a freshwater marsh. As we walked farther along the trail, the freshwater started to mix with the saltwater of Puget Sound, and we could smell the saltiness in the air.
I was surprised by the length of the boardwalks. The boardwalk to get to the Puget Sound Overlook was a mile long. The landscape was surreal: flat grassy marshes and mudflats (it was low tide) as far as the eye could see in all directions.
I thoroughly enjoyed my time birdwatching. Fellow birdwatchers were all friendly, eager to share the location of any birds that were spotted. Many brought a full-size telescope or a camera with telephoto lens. As we walked back to the parking lot, we passed by a lot of families, so we were glad that we were able to enjoy the wildlife refuge when it was uncrowded. The trails are all flat, so the wildlife refuge is a place I would consider taking my parents for a relaxing stroll.
Afterwards, we walked around Olympia. I ate a crab benedict for brunch. We saw the old legislative building and the current state capitol. The gray marble interior and chandelier felt cold and unwelcoming compared to the natural beauty that the capitol building overlooks. Outside one of the chambers, there are portraits of current Washington statesmen. One portrait stood out from the rest: a man wearing black sunglasses. It turns out, that man is the Lieutenant Governor, has accomplished quite a lot as a politician, and is blind. We strolled along the nearby boardwalk at Percival Landing, which displayed sculptures along its length. We climbed a wooden tower to get a view of the lake. Then we made our way to the farmers market. All these locations were within ten minutes of each other. Olympia’s core area is conveniently walkable.
I wanted to get away from the unceasing Seattle rain (at record levels this year!), so I drove east towards Yakima, where the skies are blue and the sun beats down relentlessly. I hiked Umtanum Ridge Crest, a 6-mile roundtrip hike with 2400 ft. of elevation gain.
Though I was only 2 hours away from the Puget Sound, the Umtanum Canyon region was like stepping into another world. The coniferous trees of the Sound were swapped for desert flora: short grasses and sagebrush. Wildflowers were in bloom (blue and purple drops, yellow flowers in star and circle shapes), peppering the rolling hills. Overgrown shrubs encroached on the trail.
There was no forest cover. The packed dirt trail was exposed, winding through hills, always with a moderate incline. We trudged along the dusty path of loose rock, walking past waterfalls and rocky caves.
After some winding turns, we could see the end, the top of a mountain. The trail turned extremely steep. Any steeper and the trail would be a scramble. There were some incredibly fit freaks of nature doing a 50K race, and they ran up and down the ridge with great agility, undaunted by the ridiculous incline. We pushed along, legs burning, but spurred on by the sight of the end of the trail.
At the top, we soaked in the panoramic view. The view back the way we came was superior to the view on the other side of the mountain. Looking behind us, we could see a massive caldera, with a single yellow tree inside. The valley undulated below us.
We ran back down the mountain, as it was more efficient than walking down slowly. The wind died down. The bugs, which gave the hike the white noise of a constant buzzing hum, swarmed thicker as we descended, no longer deterred by strong winds. I kept swatting them away from my face.
As we trekked back, we passed the familiar curves of the trail, the caves, the waterfalls, past the live railroad tracks and the green suspension bridge.
On the way home, we passed by a store that advertised in big letters, “APPLES”, “ANTIQUES”, and interestingly, “ASPARAGUS.” We stopped by for groceries and ice cream.
The next few days, my legs ached. It hurt to walk, especially up staircases, even to stand up. I will remember this hike fondly. Washington’s diversity of ecosystems is astounding!
I hiked to Lake Serene, making a detour to see Bridal Veil Falls along the way, bringing the hike to 8 miles roundtrip with 2000 feet of elevation gain.
The start of the trail was wide and flat. At around the 2-mile mark, the trail branched to climb upwards to Bridal Veil Falls. There was a steep snowfield we had to cross. I brought microspikes, which I got to use for the first time. The falls were powerful. Water beat the rocks below and produced a far-reaching spray.
On the way down from the falls, I tried glissading, but I could not stop myself on the steep, slick snow. My heart raced, as I was sliding down out of control. Luckily, there was a tree branch I could grab on to. And if I were to have fallen farther, there were some patches of shrubs below that would have probably stopped my fall. After that incident, my fellow hikers gave me advice on how to use microspikes. Instead of glissading down without an ice axe to self-arrest, they said to “trust the equipment, trust the microspikes to work.” Rather than step gingerly on the snow, they said to take firm steps to create footholds, toe-first while ascending and heel-first while descending.
Back at the junction, we continued on towards Lake Serene. We passed the lower falls, which were nearly as impressive as Bridal Veil Falls, also wide with a large volume of water. There were clear swimming holes at the base of the falls. But this was not a day for swimming: during the hike, the weather alternated between rain, sleet, and snow.
The flat trail turned into a slog of switchbacks, a stairmaster consisting alternately of actual wooden stairs, roots, and rocks. At higher elevation, we donned our traction devices again as the switchbacks became completely covered in snow. After the switchbacks, we hiked through precipitous snowfields on narrow trails forged by whoever hiked before us. At one point, there was a fairly large drop from the snowpack trail into a creek. We had to slide down, cross the creek, then lift ourselves back onto the trail.
When we finally reached the lake, I was elated. I had eaten breakfast, but the hike made me hungry, and I felt a dull and growing burning in my stomach as time went on. I guzzled down a sandwich while admiring the lake, which was covered in snow. It was certainly serene, watching the quiet lake while snowflakes fell. On the way back, the clouds opened up and we saw a rainbow in the misty blue sky. I was surprised, hiking back, seeing that we had travelled so far.
The snow made this hike challenging for me, and it was not a hike that I would have been comfortable doing alone. I am thankful for my fellow hikers, who lent me their hats to keep away the precipitation, let me borrow trekking poles, gave me advice, and pulled me up steep sections. Most of all, they were all very friendly, humorous, and supportive. Back in the parking lot, I felt relief, glad to have made it and flush with the feeling of accomplishment and expanded capabilities. I will feel more confident and capable doing hikes with this terrain in the future.
I hiked to Oyster Dome from Chuckanut Drive (Highway 11). The hike was 6.5 miles roundtrip with 2000 feet of elevation gain. We walked through forest dense with ferns and trees covered in emerald moss. We passed small waterfalls, some old-growth conifers, and story-tall moss-covered boulders.
Unfortunately, it was a cloudy day, and at the summit, we were surrounded by a thick fog that impaired all visibility of the Sound. We snacked in the rain, then went back down the trail. As we descended, the clouds broke. We could see shellfish farms. Underwater lines that were covered in shellfish were arranged in neat rows, akin to the rows of crops in a field.
Overall, this hike was quite enjoyable. The hike was easy enough that I stuffed my pack with stout, wine, and snacks. Since I did not have to work hard to reach the summit, I did not feel much disappointment that the fog had spoiled the view. This hike was low enough that there was no snow, only mud, making it an ideal early season hike. I wouldn’t mind hiking Oyster Dome again on a dry, sunny day, but I imagine on such days it would be thronged with people.
The 20 newsgroups dataset is a data set of posts on 20 topics, ranging from cryptology to guns to baseball. I looked at 3 measures of similarity: Jaccard, cosine, and L2. Comparing each article with every other article, and taking the average similarity for that newsgroup, we get the following heat maps:
Cosine similarity seems the most reasonable, because it considers the relative frequency of words instead of the actual frequency. Take the case where there are two articles, A and B, and article A is the same as article B, except each word in A appears twice as many times in B. The similarity measure ought to indicate the articles are highly similar. The Jaccard similarity would be 0.5, cosine similarity would be 1, and L2 similarity would be some non-zero number. With Jaccard and L2 similarity, the number of words in each article has some influence on the similarity measure, so when one article has a lot more words than another, they will appear more dissimilar.
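To make the doubled-article example concrete, here is a minimal sketch in Python. The exact formulas are my assumptions: Jaccard is taken as the generalized (word-count) version, sum of minimums over sum of maximums, which is what produces 0.5 here, and "L2 similarity" is taken as the negated Euclidean distance so that larger means more similar. The word counts are made up for illustration.

```python
import math

def jaccard(a, b):
    """Generalized Jaccard on word-count dicts (assumed definition):
    sum of per-word minimum counts over sum of per-word maximum counts."""
    words = set(a) | set(b)
    num = sum(min(a.get(w, 0), b.get(w, 0)) for w in words)
    den = sum(max(a.get(w, 0), b.get(w, 0)) for w in words)
    return num / den if den else 0.0

def cosine(a, b):
    """Cosine of the angle between two word-count vectors."""
    dot = sum(count * b.get(w, 0) for w, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def l2_similarity(a, b):
    """Negated Euclidean distance (assumed definition); 0 means identical."""
    words = set(a) | set(b)
    return -math.sqrt(sum((a.get(w, 0) - b.get(w, 0)) ** 2 for w in words))

# Hypothetical article A, and article B with every word count doubled.
article_a = {"hockey": 1, "goal": 2, "ice": 3}
article_b = {w: 2 * c for w, c in article_a.items()}

print("Jaccard:", jaccard(article_a, article_b))
print("cosine: ", cosine(article_a, article_b))
print("L2:     ", l2_similarity(article_a, article_b))
```

Only the cosine measure ignores the doubling; the other two penalize the difference in article length.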
Let’s look at the cosine similarity plot, but with values < 0.45 removed:
Pairs of similar newsgroups include soc.religion.christian + soc.religion.christian, talk.politics.guns + talk.politics.guns, soc.religion.christian + talk.politics.guns. Perhaps these two newsgroups have similar demographics. Other similar pairs include soc.religion.christian + alt.atheism and soc.religion.christian + talk.religion.misc. This seems plausible, that there is some overlap discussing religion or lack of it.
Next, we look at nearest-neighbor counts. For each article in a newsgroup, we find the article in another newsgroup that has the largest similarity to it, and count which newsgroup that nearest neighbor belongs to.
The average similarity plots are symmetric because each similarity formula returns the same value for (x, y) and (y, x), for any articles x and y. Nothing in the formulas depends on the order of the bag-of-words vectors.
The nearest-neighbor plot is asymmetric. If an article A has the largest Jaccard similarity to an article B, that does not mean that B has the largest Jaccard similarity to A. For example, say there are three articles X, Y, and Z. X and Y are similar, but Z is very different from both. If Z is most similar to, say, X, that does not mean X is most similar to Z. In this case, X is most similar to Y. So, just because an article in a newsgroup M has the largest similarity to an article in a newsgroup N does not mean that an article in newsgroup N will have the largest similarity to an article in newsgroup M.
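The X, Y, Z example can be checked with a tiny sketch. The similarity values below are hypothetical, chosen only so that X and Y are close and Z is far from both. The matrix itself is symmetric, but the nearest-neighbor relation is not.

```python
def nearest_neighbor(index, similarity):
    """Return the index of the most similar *other* article."""
    candidates = [j for j in range(len(similarity)) if j != index]
    return max(candidates, key=lambda j: similarity[index][j])

# Hypothetical symmetric similarity matrix for articles X, Y, Z.
names = ["X", "Y", "Z"]
similarity = [
    [1.0, 0.9, 0.2],  # X
    [0.9, 1.0, 0.1],  # Y
    [0.2, 0.1, 1.0],  # Z
]

nearest = {names[i]: names[nearest_neighbor(i, similarity)]
           for i in range(len(names))}
print(nearest)  # Z points to X, but X points to Y
```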
Looking at the Jaccard nearest-neighbor heat map, these groups are similar: talk.religion.misc + alt.atheism, soc.religion.christian + alt.atheism, rec.sport.hockey + rec.sport.baseball, comp.sys.ibm.pc.hardware + comp.os.ms-windows.misc, comp.sys.mac.hardware + comp.sys.ibm.pc.hardware.
Comparing the Jaccard plots, there is some overlap in similar newsgroups, such as soc.religion.christian + alt.atheism. In the nearest-neighbor plot, there are some newsgroups that appear similar that do not seem similar in the average similarity plot, such as comp.sys.mac.hardware + comp.sys.ibm.pc.hardware and rec.sport.hockey + rec.sport.baseball. Average similarity plots appear to have a more even distribution of similarity measures, whereas the counts in the nearest-neighbor plot are mostly low with some high counts.
Using average similarity is more suited to comparing newsgroups. With nearest-neighbors, each article has some discrete influence on similarity, so disparate newsgroups could wrongfully appear similar. It could be the case that the articles in a newsgroup are extremely dissimilar to articles in other newsgroups, such as the articles in misc.forsale. Looking at the Jaccard and cosine average similarity plots, it appears misc.forsale is dissimilar to the other newsgroups. In the nearest-neighbor plot, a noticeable number of articles in misc.forsale are nearest-neighbors to comp.sys.ibm.pc.hardware, probably because there are a lot of PCs for sale, but not the other way around. Likewise, the articles in rec.sport.hockey and rec.sport.baseball might not be similar to each other, but they are more similar to each other than to other newsgroups.
Next, we look at how reducing the number of dimensions affects the quality of results for measures of similarity. Here’s the cosine similarity nearest-neighbor heat map:
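Now we reduce the dimensions with a random projection: the bag-of-words vector of each article is multiplied by a random matrix whose entries are drawn from a standard normal distribution, giving a vector of target dimension d.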
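Here is a rough sketch of what I mean by this kind of dimension reduction, assuming it is a random projection with a d x k matrix of standard normal entries. The function names, the seed, and the toy vectors (k = 300 made-up word counts) are all illustrative, not the actual data or code behind the plots.

```python
import math
import random

def gaussian_projection(vectors, d, seed=0):
    """Project each k-dimensional vector onto d random Gaussian directions."""
    rng = random.Random(seed)
    k = len(vectors[0])
    # d x k matrix with i.i.d. standard normal entries.
    matrix = [[rng.gauss(0.0, 1.0) for _ in range(k)] for _ in range(d)]
    return [[sum(m * x for m, x in zip(row, vec)) for row in matrix]
            for vec in vectors]

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical bag-of-words vectors with k = 300 features.
rng = random.Random(1)
doc_a = [float(rng.randint(0, 3)) for _ in range(300)]
doc_b = [2.0 * x for x in doc_a]  # same article with every count doubled
doc_c = [float(rng.randint(0, 3)) for _ in range(300)]

reduced = gaussian_projection([doc_a, doc_b, doc_c], d=100)
print("original cosine(a, c):", cosine(doc_a, doc_c))
print("reduced cosine(a, c): ", cosine(reduced[0], reduced[2]))
```

Because the projection is linear, a doubled article still has cosine similarity 1 with the original after reduction. Other pairs are only approximately preserved, and the approximation should tighten as d grows.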
With no dimension reduction, calculating cosine similarities took about 202.86 seconds and finding nearest neighbors took about 0.90 seconds.
[Table: wall-clock times (seconds) for calculating cosine similarities and finding nearest neighbors at each target dimension d]
For dimension reduction and calculating cosine similarities, wall-clock time increased linearly with d.
Target dimension d=100 gave comparable results to the original embedding.
Now let’s look at a single article, and see how cosine similarities compare after dimension reduction.
The error is the vertical distance from a point on the scatterplot to y=x. As d increases, the sum of the errors and the standard deviation of the errors get smaller, because more of the information about the original words in full dimensions has been retained.
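As a sketch of the error measure: if each point on the scatterplot is (original similarity, reduced similarity), the vertical distance to y=x is the absolute difference between the two. I am assuming absolute values here, and that the standard deviation is taken over those absolute errors. The similarity values below are made up.

```python
import math

def error_summary(original, reduced):
    """Sum and standard deviation of |reduced - original| over all points."""
    errors = [abs(r - o) for o, r in zip(original, reduced)]
    mean = sum(errors) / len(errors)
    std = math.sqrt(sum((e - mean) ** 2 for e in errors) / len(errors))
    return sum(errors), std

# Hypothetical cosine similarities of one article to three others,
# before and after dimension reduction.
original_sims = [0.10, 0.40, 0.70]
reduced_sims = [0.15, 0.30, 0.70]

total_error, error_std = error_summary(original_sims, reduced_sims)
print("sum of errors:", total_error)
print("std of errors:", error_std)
```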
Looking at the target dimension vs. sum of errors:
[Table: target dimension d vs. sum of errors]
It appears that the sum of errors asymptotically decreases as d increases.
Now we try dimension reduction with a random sign (±1) instead of a normal distribution.
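A minimal sketch of the random sign variant, assuming each entry of the d x k projection matrix is +1 or -1 with equal probability. The function name, seed, and toy vectors are illustrative only.

```python
import random

def sign_projection(vectors, d, seed=0):
    """Project k-dimensional vectors with a d x k matrix of random +/-1 entries."""
    rng = random.Random(seed)
    k = len(vectors[0])
    # Each entry is +1 or -1 with equal probability.
    matrix = [[rng.choice((-1.0, 1.0)) for _ in range(k)] for _ in range(d)]
    reduced = [[sum(m * x for m, x in zip(row, vec)) for row in matrix]
               for vec in vectors]
    return matrix, reduced

# Hypothetical word-count vectors with k = 4 features, reduced to d = 3.
vectors = [[1.0, 0.0, 2.0, 1.0], [0.0, 3.0, 1.0, 0.0]]
matrix, reduced = sign_projection(vectors, d=3)
print(reduced)
```

Each reduced coordinate is just a signed sum of the original counts, so no Gaussian sampling or floating-point multiplication by irrational weights is needed.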
[Table: sum of errors by target dimension d, random normal distribution vs. random sign]
The results of dimension reduction by random sign and random normal distribution were similar. For both dimensionally-reduced matrices, the plot for d=100 was comparable to the one with full dimensions.
I tried to hike to Mailbox Peak on the new trail. It’s about 4,000 feet of elevation gain to the mailbox. At around 3,000 feet elevation gain, the snow was deeper and slicker, and the trail became steep. I would need traction devices and trekking poles. It was hailing and there was limited tree cover at that elevation. I was not feeling particularly energetic to begin with, so I turned back.
As I hiked back down through the trees, the hail turned to snow. Then as I reached lower elevations, the snow turned to rain. I passed all the familiar landmarks from my ascent: burnt trees, waterfalls, bridges, then back to leafy brush. I was disappointed that I didn’t make it to the mailbox, but I know I made the right choice in turning back.
Taking a dataset of individuals from the 1000 genomes project, with a subsample of ~10,000 nucleobases for each individual, the nucleobases were given a binary encoding based on the mode for that nucleobase position.
The individuals were from 7 African populations:
YRI: Yoruba in Ibadan, Nigeria
LWK: Luhya in Webuye, Kenya
GWD: Gambian in Western Divisions in the Gambia
MSL: Mende in Sierra Leone
ESN: Esan in Nigeria
ASW: Americans of African Ancestry in SW USA
ACB: African Caribbeans in Barbados
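The mode-based binary encoding mentioned above could look something like this. I am assuming that a nucleobase is encoded as 0 when it matches the most common nucleobase at that position across all individuals, and 1 otherwise. The three short "genomes" are made up.

```python
from collections import Counter

def binarize_by_mode(genomes):
    """Encode each nucleobase as 0 if it equals the mode at its position, else 1.

    genomes: list of equal-length strings, one per individual.
    """
    n_positions = len(genomes[0])
    # Most common nucleobase at each position across all individuals.
    modes = [Counter(g[j] for g in genomes).most_common(1)[0][0]
             for j in range(n_positions)]
    return [[0 if g[j] == modes[j] else 1 for j in range(n_positions)]
            for g in genomes]

# Hypothetical individuals with 4 positions each.
genomes = ["ACGT", "ACGA", "TCGA"]
encoded = binarize_by_mode(genomes)
for genome, row in zip(genomes, encoded):
    print(genome, row)
```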
Plotting the first and second principal components, we see the components capture geographic information.
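For a feel of what the first principal component is, here is a small sketch that finds it by power iteration on the centered data. This is just one simple way to compute it, not necessarily how the plots were made, and the 4 x 4 binary matrix is a toy stand-in for the real encoded genomes.

```python
import math

def center(rows):
    """Subtract each column's mean from that column."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def first_principal_component(rows, iterations=200):
    """Approximate the top principal direction by power iteration."""
    data = center(rows)
    k = len(data[0])
    # Fixed, non-uniform starting vector; a start orthogonal to the
    # true component would fail, which is unlikely but possible.
    v = [float(j + 1) for j in range(k)]
    for _ in range(iterations):
        # Compute X^T (X v) without forming the covariance matrix.
        scores = [sum(x * vi for x, vi in zip(row, v)) for row in data]
        w = [sum(s * row[j] for s, row in zip(scores, data)) for j in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return v

def project(rows, component):
    """Coordinate of each centered row along the given component."""
    return [sum(x * c for x, c in zip(row, component)) for row in center(rows)]

# Toy binary matrix: the first two individuals differ from the last two
# at positions 0-2; position 3 varies independently of the grouping.
toy = [[0, 0, 0, 1], [0, 0, 0, 0], [1, 1, 1, 0], [1, 1, 1, 1]]
pc1 = first_principal_component(toy)
scores = project(toy, pc1)
print(pc1)
print(scores)
```

In the toy data, the first component puts nearly all its weight on the three positions that separate the two groups, so the two groups land on opposite sides of zero along that axis.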
On the v1 axis, the populations appear genetically similar except for LWK, ACB, and ASW. The LWK population in east Africa is relatively dissimilar to the populations on the west coast of Africa. Populations ACB and ASW are even more dissimilar and have a wide spread. Perhaps there is greater genetic diversity for the ACB and ASW populations because they are more likely to have mixed ancestry. So the first principal component captures genetic similarity to west African coast populations.
On the v2 axis, we see GWD in a cluster and MSL in a cluster, and ESN, YRI, and LWK in a cluster. ACB and ASW span both the MSL and the ESN/YRI/LWK clusters. So the second principal component captures the split between the two populations on the western part of the coast (GWD + MSL) from the other central and eastern populations (ESN/YRI/LWK), while suggesting individuals in the ACB and ASW populations could have ancestry from either region.
Plotting the first and third principal components, we see the third component captures sex.
Plotting the first and fourth principal components, we see the fourth component captures whether the individual belongs to the LWK population.