Monday, February 09, 2009

head full of straw

Fuck you, Perl! Why is this so hard?

I want to take a sequence that I split into an array of single bases and join it back into chunks based on position within the array. I spent the last two hours figuring out how to do this, and this is all I came up with. It works, but Jesus:

#!/usr/bin/perl

use strict;
use warnings;

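# qw() yields a list of whitespace-separated chunks; join them into one string
# (assigning qw() straight to a scalar would keep only the last chunk)
my $seq = join "", qw(GGGCCGGCTCGCGGGCGCTGCCAGTCTCGGGCGGCGGTGTCCGGCGCG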
CGGGCGGCCTGCTGGGCGGGCTGAAGGGTTAGCGGAGCACGGGCAAGGCG
GAGAGTGACGGAGTCGGCGAGCCCCCGCGGCGACAGGTTCTCTACTTAAAA
GACAATGACTACTGATGAAGGTGCCAAGAACAATGAAGAAAGCCCCACAGC
CACTGTTGCTGAGCAGGGAGAGGATATTACCTCCAAAAAAGACAGGGGAGT
ATTAAAGATTGTCAAAAGAGTGGGGAATGGTGAGGAAACGCCGATGATTGG
AGACAAAGTTTATGTCCATTACAAAGGAAAATTGTCAAATGGAAAGAAGTTT
GATTCCAGTCATGATAGAAATGAACCATTTGTCTTTAGTCTTGGCAAAGGCC
AAGTCATCAAGGCATGGGACATTGGGGTGGCTACCATGAAGAAAGGAGAGA
TATGCCATTTACTGTGCAAACCAGAATATGCATATGGCTCGGCTGGCAGTCT
CCCTAAAATTCCCTCGAATGCAACTCTCTTTTTTGAGATTGAGCTCCTTGATT
TCAAAGGAGAGGATTTATTTGAAGATGGAGGCATTATCCGGAGAACCAAACG
GAAAGGAGAGGGATATTCAAATCCAAACGAAGGAGCAACAGTAGAAATCCA
CCTGGAAGGCCGCTGTGGTGGAAGGATGTTTGACTGCAGAGATGTGGCATT
CACTGTGGGCGAAGGAGAAGACCACGACATTCCAATTGGAATTGACAAAGC
TCTGGAGAAAATGCAGCGGGAAGAACAATGTATTTTATATCTTGGACCAAGA
TATGGTTTTGGAGAGGCAGGGAAGCCTAAATTTGGCATTGAACCTAATGCTG
AGCTTATATATGAAGTTACACTTAAGAGCTTCGAAAAGGCCAAAGAATCCTG
GGAGATGGATACCAAAGAAAAATTGGAGCAGGCTGCCATTGTCAAAGAGAA
GGGAACCGTATACTTCAAGGGAGGCAAATACATGCAGGCGGTGATTCAGTAT
GGGAAGATAGTGTCCTGGTTAGAGATGGAATATGGTTTATCAGAAAAGGAAT
CGAAAGCTTCTGAATCATTTCTCCTTGCTGCCTTTCTGAACCTGGCCATGTGC
TACCTGAAGCTTAGAGAATACACCAAAGCTGTTGAATGCTGTGACAAGGCCC
TTGGACTGGACAGTGCCAATGAGAAAGGCTTGTATAGGAGGGGTGAAGCCC
AGCTGCTCATGAACGAGTTTGAGTCAGCCAAGGGTGACTTTGAGAAAGTGCT
GGAAGTAAACCCCCAGAATAAGGCTGCAAGACTGCAGATCTCCATGTGCCAG
AAAAAGGCCAAGGAGCACAACGAGCGGGACCGCAGGATATACGCCAACATG
TTCAAGAAGTTTGCAGAGCAGGATGCCAAGGAAGAGGCCAATAAAGCAATGG
GCAAGAAGACTTCAGAAGGGGTCACTAATGAAAAAGGAACAGACAGTCAAGC
AATGGAAGAAGAGAAACCTGAGGGCCACGTATGACGCCACGCCAAGGAGGG
AAGAGTCCCAGTGAACTCGGCCCCTCCTCAATGGGCTTTCCCCCAACTCAGG
ACAGAACAGTGTTTAATGTAAAGTTTGTTATAGTCTATGTGATTCTGGAAGCA
AATGGCAAAACCAGTAGCTTCCCAAAAACAGCCCCCCTGCTGCTGCCCGGAG
GGTTCACTGAGGGGTGGCACGGGACCACTCCAGGTGGAACAAACAGAAATGA
CTGTGGTGTGGAGGGAGTGAGCCAGCAGCTTAAGTCCAGCTCATTTCAGTTT
CTATCAACCTTCAAGTATCCAATTCAGGGTCCCTGGAGATCATCCTAACAATG
TGGGGCTGTTAGGTTTTACCTTTGAACTTTCATAGCACTGCAGAAACCTTTTA
AAAAAAAATGCTTCATGAATTTCTCCTTTCCTACAGTTGGGTAGGGTAGGGGA
AGGAGGATAAGCTTTTGTTTTTTAAATGACTGAAGTGCTATAAATGTAGTCTG
TTGCATTTTTAACCAACAGAACCCACAGTAGAGGGGTCTCATGTCTCCCCAGT
TCCACAGCAGTGTCACAGACGTGAAAGCCAGAACCTCAGAGGCCACTTGCTT
GCTGACTTAGCCTCCTCCCAAAGTCCCCCTCCTCAGCCAGCCTCCTTGTGAGA
GTGGCTTTCTACCACACACAGCCTGTCCCTGGGGGAGTAATTCTGTCATTCCT
AAAACACCCTTCAGCAATGATAATGAGCAGATGAGAGTTTCTGGATTAGCTTT
TCCTATTTTCGATGAAGTTCTGAGATACTGAAATGTGAAAAGAGCAATCAGAA
TTGTGCTTTTTCTCCCCTCCTCTATTCCTTTTAGGGAATAATATTCAATACACA
GTACTTCCTCCCAGCATTGCTACTGCTCAGCTTCTTCTTTCATTCTAATCCTTG
CTATTAAGAATTTAAGACTTGTGCTTACAATATTTTTGACCTGGAGTGGATCT
ATTTACATAGTCATTTAGGATCCATGCAGCTTTTTTTGTCTTTTTAAGATTATT
GGCTCATAAGCATATGTATACTGGTTTATGGAACTTTATTTACACTCCTCTATC
ATGCAAAAAAATTTTGACTTTTTAGTACTAAGCTTAATTTTTAAAAACAAAATC
TGTAGTGTTGACAAATAAATAGTTGCTCTTCTACACTAGGGGTTTCACCTGCA
GGTTTGACACGCAGTTGCTCGCTTTTCCTGCCCTGTCAAGCTTCTCTGTTCTG
GCGTGAGTTGTGAAAGAGTTGAAGACAGCTTCCCATGCCGGTACACAGCCAG
TAGCCTAAATCTCCAGTACTTGAGCTGACCATTGAACTAGGGCAAGTCTTAAA
TGTGTACATGTAGTTGAATTTCAGTCCTTACGGGTAAACAGATTGAGCATGGC
TCTCTATTCCCTCAGCCTAAGAAACACTCATGGGAATGCATTTGGCAACCCAA
GGAACCATTTGCTTAAACCTGGAACATCTCACCTTTTTAAATCCTAAAAAACA
CTGGCAGTTATATTTTAAATTAGTTTTTATTTTTATGATGGTTTTATCAAAAGA
CTTTTATTATTAGATTGGGACCCCCTTCAAACCTAAAAATCAAGTTATTTCCTT
TTATAATACTTTTCTTCCCCATGGAACAAATGGGATCAATTTGTGAGTTTTTTC
CTTTAATGATAACTAAAATCCCTCTAATTTCTCATTTATGCTTTTGTCTTTTTTA
TGAAATATTTCTTTTAAAAGCCCCAGTCTCACCTACGAAATATGAAGAGCAAA
AGCTGATTTTGCTTACTTGCTAAACTGTTGGGAAAGCTCTGTAGAGCATGGTT
CCAGTGAGGCCAAGATTGAAATTTGATACTAAAAAGGCCACCTAGCTTTTTGC
AGATAACAAACAAGAAAGCTATTCCAAGACTCAGATGATGCCAGCTGTCTCCC
ACGTGTGTATTATGGTTCACCAGGGGGAACTGGCAAAAGTGTGTGTGGGGAG
GGGAAGGGTGTGTGAGTGGTTCTGAGCAAATAACTACAGGGTGCCCATTACC
ACTCAAGAAGACACTTCACGTATTCTTGTATCAAATTCAATAATCTTAAACAAT
TTGTGTAGAAGTCCACAGACATCTTTCAACCACCTTTTAGGCTGCATATGGAT
TGCCAAGTCAGCATATGAGGAATTAAAGACATTGTTTTTAAAAAAAAAAAATC
ATTTAGATGCACTTTTTTGTGTGTTCTTTAAATAAATCCAAAAAAAATGTGAAA
AAAAAAA);


my @x = split //, $seq;

my $exon_breaks= "0,133,257,402,545,660,817,908,992,1178,1418,3770";

my @y = split (/,/, $exon_breaks);


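# Print the exon between the first two breaks, then drop the first break
sub get_exon_sequence {
    my @array = ($y[0] .. $y[1]);    # indices of this exon, both ends included
    my $exon  = join "", @x[@array];
    print "$exon \n";
    shift @y;
}

# Stop once fewer than two breaks remain (comparing against undef with eq warns)
get_exon_sequence() while defined $y[1];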
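
For what it's worth, the same job can probably be done without exploding the sequence into an array at all. Here is a minimal sketch using substr(), assuming each exon runs from one break up to, but not including, the next (the script above includes both ends, so neighbouring exons share a base). The exon_sequences helper and the toy data are made up for illustration:

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Hypothetical helper (not part of the original script): returns the pieces
# of $seq that lie between consecutive break positions.
sub exon_sequences {
    my ($seq, @breaks) = @_;
    my @exons;
    for my $i (0 .. $#breaks - 1) {
        my ($start, $end) = @breaks[$i, $i + 1];
        # Assumption: half-open interval, $start included and $end excluded
        push @exons, substr($seq, $start, $end - $start);
    }
    return @exons;
}

# Toy data, invented for this example
my $toy_seq    = "AAACCCGGGGTT";
my @toy_breaks = split /,/, "0,3,6,10,12";

print "$_\n" for exon_sequences($toy_seq, @toy_breaks);
```

On the toy data this prints AAA, CCC, GGGG and TT, one per line. If the breaks really are meant to be inclusive at both ends, the length passed to substr() would need to be $end - $start + 1 instead.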

2 comments:

Alexis du Bois said...

My dear Ed; you know myself and others are always here, waiting to help you. That is permanent.

Brennen said...

Did you ever arrive at a better solution? If I understand what you want, there's probably a much easier approach using substr() and friends...