::opts_chunk$set(engine.path=list(bash='C:/Program Files/Git/git-bash.exe')) knitr
Daily Log
31 October 2023
Tuesday
30 October 2023
Monday
27 October 2023
Friday – Science Hour Questions
Questions for Science Hour:
Do we know how long (i.e. across how many generations) transgenerational/epigenetic acclimatization lasts?
- Don’t really know. For mice/rats shown to persist 6-7 generations, but will definitely vary among taxa. Lab works with species with longerish generations, so has only really gone out to F2 (second generation)
What about acclimatization related to microbiome? For example, say a coral has a more heat resistant symbiont. Symbionts can be “inherited” by offspring through vertical transmission (parent to offspring) or obtained by the offspring through horizontal transmission (picking up symbionts from water column or other corals). If you “infect” a coral with a thermally-tolerant symbiont, how many generations end up with the same symbiont (on average)? (Ofc this information would be more applicable to understanding acclimatization in species with short generation times, aka not coral)
Do we know whether specific conditions facilitate acclimation/acclimatization? I’m thinking of how adaptation is facilitated by things like high genetic diversity, population connectivity, etc – is there a proxy for acclimatization?
epigenetic landscape, gene copy numbers (e.g. large #s of exons)
highly changeable environment also affects epigenome (why?)
Huttenhofer, Schattner, & Poleacek 2005:
The paper often used the term “identified” (eg. “snoRNAs have been identified by experimental and computational means”) – what do they mean by this? Do they mean we know that a specific sequence is a snoRNA? That we know what it does?
- kind of both…?
The paper said “ncRNAs often lack the characteristic features used by genefinders for protein-coding genes” and, in contrast, “have widely varying motifs”. How, then, are ncRNAs identified? The paper mentions their “secondary structure” – what does this mean?
Mentioned the possibility of siRNAs being used as a applied tool to treat human disease because of their possible functionality in changing expression of disease-related genes. Does that mean it’s possible that (presumably far in the future) ncRNAs could have be used to bolster stress/disease resilience in a conservation setting?
- yes!
Mentioned that siRNAs and miRNAs, as a part of the ribonucleo-protein complex RISC, target mRNA using base complementarity – does that mean that you can easily tell what siRNA/miRNA target (and thus their functionality) once you sequence them by simply IDing the associated strand of mRNA?
Paper stated “it is not even known whether these [more recently discovered ncRNAs] have any function, and consequently they are more appropriately considered as ncRNA ‘candidates.’” Does this imply that ncRNAs must by definition be functional? I thought the term encompassed any RNA that was non-coding, including potentially non-functional RNA bits…
- kind of a semantics issue
In corals seeing a high prop of short ncRNAs that can’t be classified as miRNAs
Don’t know if there are taxa-specific features of different ncRNA classes
26 October 2023
Thursday – Getting Registered for SICB and Science Hour Questions
TREQ is apparently undergoing system modifications right now, so I can’t get added to the system. That means I can’t individually register and submit purchasing requests for SICB. Instead, I’ve been advised to register for the conference and then submit necessary information to Jonas Louie so they can submit a reimbursement request on my behalf.
Questions for Science Hour:
Do we know how long (i.e. across how many generations) transgenerational/epigenetic acclimatization lasts?
What about acclimatization related to microbiome? For example, say a coral has a more heat resistant symbiont. Symbionts can be “inherited” by offspring through vertical transmission (parent to offspring) or obtained by the offspring through horizontal transmission (picking up symbionts from water column or other corals). If you “infect” a coral with a thermally-tolerant symbiont, how many generations end up with the same symbiont (on average)? (Ofc this information would be more applicable to understanding acclimatization in species with short generation times, aka not coral)
Do we know whether specific conditions facilitate acclimation/acclimatization? I’m thinking of how adaptation is facilitated by things like high genetic diversity, population connectivity, etc – is there a proxy for acclimatization?
Huttenhofer, Schattner, & Poleacek 2005:
The paper often used the term “identified” (eg. “snoRNAs have been identified by experimental and computational means”) – what do they mean by this? Do they mean we know that a specific sequence is a snoRNA? That we know what it does?
The paper said “ncRNAs often lack the characteristic features used by genefinders for protein-coding genes” and, in contrast, “have widely varying motifs”. How, then, are ncRNAs identified? The paper mentions their “secondary structure” – what does this mean?
Mentioned the possibility of siRNAs being used as a applied tool to treat human disease because of their possible functionality in changing expression of disease-related genes. Does that mean it’s possible that (presumably far in the future) ncRNAs could have be used to bolster stress/disease resilience in a conservation setting?
Mentioned that siRNAs and miRNAs, as a part of the ribonucleo-protein complex RISC, target mRNA using base complementarity – does that mean that you can easily tell what siRNA/miRNA target (and thus their functionality) once you sequence them by simply IDing the associated strand of mRNA?
Paper stated “it is not even known whether these [more recently discovered ncRNAs] have any function, and consequently they are more appropriately considered as ncRNA ‘candidates.’” Does this imply that ncRNAs must by definition be functional? I thought the term encompassed any RNA that was non-coding, including potentially non-functional RNA bits…
25 October 2023
Wednesday – Blast/Annotate Code Transcriptome
Scaling up from the blast tutorial yesterday, I was able to run the code Steven added to the cod project repo to blast and annotate a cod transcriptome! I was also able to log in to my new Gannet account and get it set up, so I should (theoretically) be able to store large files there now. I’m still not 100 sure how I would do that though. I’ve read through some old Github issues and stuff and I think I’d need to either ssh or rsync to connect to Gannet from Raven and then copy over the large files using that connection. I’m not not familiar with rsync, so I’m not sure what the difference between ssh and rsync is or which is preferable. I also am not 100% sure what the server host name/IP address would be to ssh into Gannet, but I’m guessing I can use the IP that Gannet emailed to me. Since I’m not confident in correctly connecting and copying to Gannet from Raven I haven’t uploaded the blast/annotation output files or pushed anything from running the rmd to github. I’ll either ask about it in lab tomorrow or Science Hour on Friday.
I’m also still not sure what a lot of the code from Steven’s blast/annotation code is doing, so I want to do a deep-dive into the commands being run.
Lastly, I finally took a moment to try registering for SICB only to find out I don’t have access to the College of Environment purchasing system, Treq. I requested access through the indicated email and will ask about it during Friday Science Hour if I don’t hear back by then.
24 October 2023
Tuesday – Learning how to BLAST from command line
After several weeks of accumulating issues with running command line and working in RStudio/Github (both locally and on Raven), I think I’ve finally resolved them all! I was able to run the tutorial blast code successfully (I think). I still want to figure out/ask about the function of all the parameters called (e.g. -evalue 1E-20), and to figure out how to interpret the output results (or at least learn how to identify possible issues in the output)
23 October 2023
Monday
20 October 2023
Friday
19 October 2023
Thursday – Getting situated in Raven (cont.)
Revelations from lab meeting today! Several of my Raven problems were solved during our raven/github-focused meeting today – notes below:
Connecting Raven with my Github. I needed to enter
```
{bash} git config –global user.email "[mygithubemailthatIdon'twanttopublicize]" git config –global user.name "shedurkin"
```and then log in to git using my username and a personal access token (which I have now created). I’m now fully connected to github from Raven and have successfully pushed/pulled!
Getting/accessing ncbi blast software on Raven. Blast is indeed already downloaded and accessible on Raven, it’s just in a higher-level directory that I couldn’t find because I didn’t know about it (I can only default view my home directory on Raven). The higher-level lab directory is /home/shared, and ncbi-blast v. 2.12.0 is already installed there!
Pushing large files to Github. Don’t! Any file (e.g. output file) sizes larger than 100MB should not be pushed to Github - instead, put it on Gannet (another lab server intended for large-file-size data and output storage). Already github-issued Sam/Steven to ask for access to Gannet.
18 October 2023
Wednesday – October Goals
After fairly continuous head-bashing (pun intended ;) ) with RStudio/bash/blast for the last two weeks I’m kind of tired of it, so today I’m going to focus on other stuff – like setting some goals! I’ve had ~3 weeks now to get settled into the lab and Steven and I have had the chance to talk about possible directions for the near future, so let’s put some stuff in writing – see October Goals post for details.
17 October 2023
Tuesday – Getting situated in Raven (cont.)
I continued trying to figure out how to work in Raven today and ran into another fun problem – my Raven RStudio isn’t connected to my github, even though I copied the directory I’m working in directly from Github in the same way I would copy a repo to RStudio on my local machine. I realized I wasn’t able to commit or push/pull, and using the command ```{bash} git config –global user.name "shedurkin"
``` didn’t fix the issue. It clearly isn’t an issue with my shell script/terminal because I’ve been successfully running command line code on Raven. Maybe Raven isn’t supposed to end up tied to github? I would think the code and stuff should all end up on github for backup/reproducibility purposes, but I can also see why sending some things to github (e.g. output files) could require an unreasonable amount of storage…
I also managed to modify the 001-blast.Rmd tutorial script to download a windows version of ncbi blast to Raven and run the .exe, but I wasn’t able to successfully access the blast.exe component used in the first task of the tutorial file (setting blast directory). Neither I nor google can figure out why with a cursory search and I unfortunately have class soon, but in the worst case I can bring this up at lab meeting on Thursday.
16 October 2023
Monday – Getting situated in Raven
Sam was able to resend me login credentials for Raven (the Lab’s primary server, implemented through a browser version of RStudio), so I started getting situated on it. Since I have a pending task related to blasting a cod transcriptome, I copied over my github repo RobertsLab_OnboardingTasks and took another look at the blast tutorial .Rmd file (001-blast.Rmd). The last time I tried to run this tutorial file (on RStudio locally) I ran into problems running bash through RStudio at all.
Thankfully that isn’t an issue working in Raven, but now I’m kind of stumped because of the need to download and run blast. I can’t find any “higher level” directories above my own Raven account that might hold relevant software like ncbi-blast, but it seems weird that I’d need to individually download the software for use on a shared server…The code included in the tutorial file for this also wouldn’t work on my machine, since it’s written with Mac-specific and outdated versions of blast. I downloaded the most recent Windows version of blast to my local machine, but I don’t think I can access local software from Raven since it’s a server.
13 October 2023
Friday – Cod Temp/Size Analysis
Cod Temp/Growth Analysis
I finished the cod temp/growth analysis assigned by Steven
R/bash issues (cont.)
I’m still having issues with the disconnect between which shell script is being used by my terminal and which is used by console in R. We realized today that my problems pushing my analysis code to the lab’s project-cod-temperature repo on github were related to the disconnect problem. I was able to pull and push to github when I was working within the repo on RStudio, but I couldn’t commit using the Git environment in the upper right pane. However, pushing, pulling, and committing manually from the terminal all worked fine. We think that, somehow, the commit action in the Git environment is using the same faulty WSL shell that my console is defaulting to, even though the push/pull actions are correctly defaulting to Git Bash. To make matters even more confusing, when I’m working within my lab notebook repo in RStudio I’m able to push, pull, and commit successfully from the Git environment with no issues. Why is my default shell script for console/some parts of git different between repos/RStudio projects?
No answers yet, but for right now I’m still able to maintain functionality while working locally because the terminal is functioning appropriately and I have a workaround to run bash code chunks included in Rmd files (see Oct.12 post).
12 October 2023
Thursday – RStudio/Bash Struggles (cont.)
I think I may have fixed the issue running bash code chunks in Rmd files! Included the following:
And now, while I still don’t seem to get any output when I try to run individual code chunks by pressing the “play” button in the top right corner, knitting the document works! I still haven’t figured out how to permanently set the default shell script for running bash code chunks from Rmd files (checking default shell using “Sys.which(”bash”)” from the console still returns “C::\\WINDOWS\\SYSTEM32\\bash.exe” aka WSL), so for now I guess I’ll just have to include that YAML block in any Rmd file I’m working in.
Started working on cod temp/size data analysis assigned by Steven but haven’t been able to push to the lab Github repo (project-cod-temperature) bc, while I successfully cloned the repo to RStudio, I haven’t been able to connect them so I’m able to push changes back to it. Will ask about it during Science Hour tomorrow
11 October 2023
Wednesday – RStudio/Bash Struggles (cont.)
I’ve continued trying to troubleshoot my issues with running bash code chunks in .Rmd files. Running Sys.which(“bash”) in the RStudio console the path “C:\\WINDOWS\\SYSTEM32\\bash.exe”, which I think is bash installed as a part of the Windows Subsystem for Linux (WSL). I have WSL installed and I’ve tried opening and running commands in the WSL bash, but it crashes every time I open it, before I have the chance to try any commands. Maybe there’s something wrong with it which is why bash chunks won’t run in RStudio? I’m not 100% convinced because, weirdly enough, the terminal in my RStudio is functional – I can run command line code from it (e.g. “pwd”) from it with correct outputs. Running “which bash” from RStudio’s terminal returns the path, “/usr/bin/bash”. I have no idea why the terminal would be functional for running command line but bash code chunks inside an Rmd file aren’t.
Ok I checked which shell my RStudio Terminal was using (Tools > Global Options > Terminal > New terminals open with) and my Terminal was using Git Bash. When I changed it to open using WSL Bash and tried to open a new terminal, the new terminal opened briefly and then immediately crashed and closed, just like when I tried opening WSL bash on my machine. Switching back to Git Bash and opening a new terminal with it worked just fine, which means the problem running bash code chunks in Rmd files must be because (for some reason) the bash code chunks in Rmd files are being run using the (apparently nonfunctional) WSL bash, instead of Git Bash! I need to figure out how to change the path so that bash code chunks will run with Git Bash – I honestly have no idea how to do this though… I would normally assume that setting a preferred shell script for terminal would also set the script for running bash code chunks. The disconnect between terminal and shell script used by console also suggests I won’t be able to change the console path to bash from the terminal so….. I’m stumped. Googling the issue is not helpful.
10 October 2023
Tuesday – RStudio/Bash Struggles
I started looking at the github tasks Steven assigned me to familiarize me with Raven and the lab’s computing practices (analyzing some weight/temp data and BLASTing/annotating a transcriptome), but quickly ran into setup issues. I made a new github repo using the lab’s project template and tried running through one of the blast tutorials included in the project template, but immediately had problems with running the bash code chunks. I’ve done command-line data analysis work before, but never through an Rmarkdown file or in RStudio. When I tried to run any of the bash code chunks nothing would happen, not even an error output. First I updated Windows and RStudio, since I had updates waiting for both of them, but that didn’t help. Since I work on a Windows laptop, which doesn’t have a default way to run shell scripts, I downloaded and normally use the software Cygwin – I think the problem running bash code in Rstudio may stem from RStudio not having a way to run bash, aka not having it’s path to run bash script directing to Cygwin? I’ll try more troubleshooting tomorrow.
09 October 2023
Monday
06 October 2023
Friday – Lit Review and Science Hour
Completed an additional employee training and went over my lit review questions (see Oct.5 entry) with Steven. Also github-issued Sam about getting added to Raven.
Questions:
General
I read that both cytosine and adenine can be methylated, but every paper I’ve read so far studying methylation only looked at methylation of cytosine – is there a reason adenine methylation is not commonly studied?
- A: adenine is only methylated in certain taxa (e.g. bacteria), and isn’t methylated in marine inverts
Is all methylation (in gamete cells) heritable/potentially heritable? Do we know? If not, do we know why/what determines heritability? What about other forms of epigenetic modification?
Similarly, what proportion of methylation (or epigenetic modification in general) affects gamete cells?
A: Differs among taxa. We have secondary evidence that it is in marine inverts. For taxa where there’s more evidence, modification has to happen during early developmental stages of gonads (presumably to alter the gamete cells). A lot of it is unknown though (similar for other forms of modification e.g. histone modification)
George et al. 2023 (Triploid Pacific oysters exhibit stress response dysregulation…)
George et al. 2023 studied polyploidy in oysters because of its common use in oyster aquaculture – does polyploidy have any potential conservation applications, particularly in eukaryotes, for whom ployploidy is often better tolerated? One scenario I can think of is intentional hybridization of two species to selectively breed a more resilient population – since polyploidy is more common in hybrids, may be important to understand its impacts. I also know polyploidy sometimes happens in corals (especially Caribbean corals), which can reproduce asexually fairly easily. (Need to read related paper Stephens et al. 2023 – w/ Putnam and Bhattacharya!)
- A: Since polyploidy reduces breeding potential so could be useful in conservation. But don;t have any information on relationship between polyploidy and methylation or the mechanisms of how polyploidy affects methylation
Confused about a couple aspects of the experimental setup:
Stated that lab obtained 500 oysters for the experiment, but then states that the starting sample size of each of the six treatments (2 controls, 2 single stressor, 2 multiple stressors) was 112, which adds to way over 500 oysters
- A:
For the multiple stressors treatment, why was aerial exposure not repeated multiple times (since low tide happens on a regular daily schedule, not just once during a given warming event)?
- A: Don’t want to go overboard and kill all the oysters, so start out trying with less stressful stress conditions. Want to understand more than just the death outcome.
In Results, why is survival probability not the direct inverse of mortality?
- A: doesn’t know, maybe check supplemental materials or ask in Slack
I think I noticed some typos… do you want to know about them or just leave it be?
Putnam et al. 2023 (Dynamic DNA methylation contributes to carryover effects…)
In the results section I saw that shell area was used as a proxy for shell size – is it assumed that shell thickness remains fairly constant? Do we know this will hold during acidification? Also, are shell size and body mass tightly correlated? (I would assume so, but it would be interesting if acidic conditions messed with that relationship)
A: Not assumed, but it wasn’t measured. Emma Tidman paper looks at effect of acidification on various measures of shell strength/size.
A: Emma also looked at body mass/tissue impacts of acidification (but yes, there is generally a correlation between shell size and body mass)
In Fig.3 I’m a little confused about the y-axes of 3c and 3d – what is the baseline (the “1”) for these relative size values? It doesn’t seem like it could be original size (as in a and b)
- A: check supplemental
What is the difference between differentially methylated regions (DMRs) and differentially methylated genes (DMGs)? Are DMRs just methylation of non-coding DNA (e.g. introns)?
- A: Regions are just more broad than genes (so regions could contain genes). The scale (in terms of number of bp) goes loci -> genes -> regions.
I have no idea what’s going on in figure 4 :(
- A: each cell represents gene and the z-score is basically showing a heatmap of gene expression. The “tree” looking thing is just showing clustering of individuals based on expression level.
or figure 5
- A: Steven is also kind of confused about this rn, but check supplementals for more
What are DEGs (pg12)? Also, are CpGs just cytosine/guanine pairs?
- DEG=Differentially expressed gene (rather than differentially methylated), and yes, CpGs are cytosine/guanine pairs
05 October 2023
Thursday – Lit Review
Continued reading through recent lab publications (see entry from yesterday) and compiled some questions about them to ask Steven at Science Hour tomorrow (see below). I also completed the required HazCom training (and returned updated training documentation to Sam), requested access to the UW chemical inventory/safety system, and wrote and uploaded a personal description, lab notebook link, email, and picture, to the lab website.
Questions:
General
I read that both cytosine and adenine can be methylated, but every paper I’ve read so far studying methylation only looked at methylation of cytosine – is there a reason adenine methylation is not commonly studied?
Is all methylation (in gamete cells) heritable/potentially heritable? Do we know? If not, do we know why/what determines heritability? What about other forms of epigenetic modification?
- Similarly, what proportion of methylation (or epigenetic modification in general) affects gamete cells?
George et al. 2023 (Triploid Pacific oysters exhibit stress response dysregulation…)
George et al. 2023 studied polyploidy in oysters because of its common use in oyster aquaculture – does polyploidy have any potential conservation applications, particularly in eukaryotes, for whom ployploidy is often better tolerated? One scenario I can think of is intentional hybridization of two species to selectively breed a more resilient population – since polyploidy is more common in hybrids, may be important to understand its impacts. I also know polyploidy sometimes happens in corals (especially Caribbean corals), which can reproduce asexually fairly easily. (Need to read related paper Stephens et al. 2023 – w/ Putnam and Bhattacharya!)
Confused about a couple aspects of the experimental setup:
Stated that lab obtained 500 oysters for the experiment, but then states that the starting sample size of each of the six treatments (2 controls, 2 single stressor, 2 multiple stressors) was 112, which adds to way over 500 oysters
For the multiple stressors treatment, why was aerial exposure not repeated multiple times (since low tide happens on a regular daily schedule, not just once during a given warming event)?
In Results, why is survival probability not the direct inverse of mortality?
I think I noticed some typos… do you want to know about them or just leave it be?
Putnam et al. 2023 (Dynamic DNA methylation contributes to carryover effects…)
In the results section I saw that shell area was used as a proxy for shell size – is it assumed that shell thickness remains fairly constant? Do we know this will hold during acidification? Also, are shell size and body mass tightly correlated? (I would assume so, but it would be interesting if acidic conditions messed with that relationship)
In Fig.3 I’m a little confused about the y-axes of 3c and 3d – what is the baseline (the “1”) for these relative size values? It doesn’t seem like it could be original size (as in a and b)
What is the difference between differentially methylated regions (DMRs) and differentially methylated genes (DMGs)? Are DMRs just methylation of non-coding DNA (e.g. introns)?
I have no idea what’s going on in figure 4 :(
or figure 5
What are DEGs (pg12)? Also, are CpGs just cytosine/guanine pairs?
General interests:
Understanding and mitigating impacts of climate change/anthropogenic disturbance on marine ecosytems
Specific interests:
epigenetics (and heritbility of epigenetic modifications), phenotypic plasticity, microbiome, disease resistance/resilience, ecotoxicology
Additional papers to read:
Eirin-Lopez and Putnam, 2019 (shows DNA methylation varies in response to environmental factors in marine inverts)
Stephens et al. 2023 (Ploidy variation and its implications for reproduction and population dynamics in two sympatric coral species)
Strader et al. 2019 (heritable epigenetic modifications and transgenerational plasticity in sea urchins)
04 October 2023
Wednesday – Lit Review
I’ve been reading through a selection of papers recently published by the lab to learn more about past/ongoing projects and begin thinking about what kind of project I might want to join/start. I also read the paper Chris selected for lab meeting tomorrow (Pham et al. 2023).
03 October 2023
Tuesday
02 October 2023
Monday – Onboarding Tasks and Trainings
Completed several lab onboarding tasks, including the reviewing the PPE Hazard Assessment, Fire Extinguisher Training, and Managing Laboratory Chemicals Training, and submitted relevant documentation to Sam.
Also emailed Sam to set up meeting time to go over relevant lab safety details in person.