libStatGen Software 1
CigarRoller Class Reference

The purpose of this class is to provide accessors for setting, updating, and modifying the CIGAR object. It is a child class of Cigar.

#include <CigarRoller.h>


Public Member Functions

 CigarRoller ()
 Default constructor initializes as a CIGAR with no operations.
 
 CigarRoller (const char *cigarString)
 Constructor that initializes the object with the specified cigarString.
 
CigarRoller & operator+= (CigarRoller &rhs)
 Add the contents of the specified CigarRoller to this object.
 
CigarRoller & operator+= (const CigarOperator &rhs)
 Append the specified operator to this object.
 
CigarRoller & operator= (CigarRoller &rhs)
 Set this object to be equal to the specified CigarRoller.
 
void Add (Operation operation, int count)
 Append the specified operation with the specified count to this object.
 
void Add (char operation, int count)
 Append the specified operation with the specified count to this object.
 
void Add (const char *cigarString)
 Append the specified cigarString to this object.
 
void Add (CigarRoller &rhs)
 Append the specified Cigar object to this object.
 
bool Remove (int index)
 Remove the operation at the specified index.
 
bool IncrementCount (int index, int increment)
 Increments the count for the operation at the specified index by the specified value; specify a negative value to decrement.
 
bool Update (int index, Operation op, int count)
 Updates the operation at the specified index to be the specified operation and have the specified count.
 
void Set (const char *cigarString)
 Sets this object to the specified cigarString.
 
void Set (const uint32_t *cigarBuffer, uint16_t bufferLen)
 Sets this object to the BAM formatted cigar found at the beginning of the specified buffer, which is bufferLen long.
 
int getMatchPositionOffset ()
 DEPRECATED - do not use; there are better ways to accomplish this using read lengths, reference lengths, the span of the read, etc.
 
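const char * getString ()
 Get the string representation of the Cigar operations in this object. The returned pointer refers to a function-static buffer that is reused by later calls, so the caller should not free it and should copy the string if it must outlive the next call.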
 
void clear ()
 Clear this object so that it has no Cigar Operations.
 
- Public Member Functions inherited from Cigar
 Cigar ()
 Default constructor initializes as a CIGAR with no operations.
 
void getCigarString (String &cigarString) const
 Set the passed in String to the string representation of the Cigar operations in this object.
 
void getCigarString (std::string &cigarString) const
 Set the passed in std::string to the string representation of the Cigar operations in this object.
 
void getExpandedString (std::string &s) const
 Sets the specified string to a valid CIGAR string of characters that represent the cigar with no digits (a CIGAR of "3M" would return "MMM").
 
const CigarOperator & operator[] (int i) const
 Return the Cigar Operation at the specified index (starting at 0).
 
const CigarOperator & getOperator (int i) const
 Return the Cigar Operation at the specified index (starting at 0).
 
bool operator== (Cigar &rhs) const
 Return true if the two Cigars are the same (the same operations of the same sizes).
 
int size () const
 Return the number of cigar operations.
 
void Dump () const
 Write this object as a string to cout.
 
int getExpectedQueryBaseCount () const
 Return the length of the read that corresponds to the current CIGAR string.
 
int getExpectedReferenceBaseCount () const
 Return the number of bases in the reference that this CIGAR "spans".
 
int getNumBeginClips () const
 Return the number of clips that are at the beginning of the cigar.
 
int getNumEndClips () const
 Return the number of clips that are at the end of the cigar.
 
int32_t getRefOffset (int32_t queryIndex)
 Return the reference offset associated with the specified query index or INDEX_NA based on this cigar.
 
int32_t getQueryIndex (int32_t refOffset)
 Return the query index associated with the specified reference offset or INDEX_NA based on this cigar.
 
int32_t getRefPosition (int32_t queryIndex, int32_t queryStartPos)
 Return the reference position associated with the specified query index, or INDEX_NA, based on this cigar and the specified queryStartPos, which is the leftmost mapping position of the first matching base in the query.
 
int32_t getQueryIndex (int32_t refPosition, int32_t queryStartPos)
 Return the query index or INDEX_NA associated with the specified reference offset when the query starts at the specified reference position.
 
int32_t getExpandedCigarIndexFromQueryIndex (int32_t queryIndex)
 Returns the index into the expanded cigar for the cigar associated with the specified queryIndex.
 
int32_t getExpandedCigarIndexFromRefOffset (int32_t refOffset)
 Returns the index into the expanded cigar for the cigar associated with the specified reference offset.
 
int32_t getExpandedCigarIndexFromRefPos (int32_t refPosition, int32_t queryStartPos)
 Returns the index into the expanded cigar for the cigar associated with the specified reference position and queryStartPos.
 
char getCigarCharOp (int32_t expandedCigarIndex)
 Return the character code of the cigar operator associated with the specified expanded CIGAR index.
 
char getCigarCharOpFromQueryIndex (int32_t queryIndex)
 Return the character code of the cigar operator associated with the specified queryIndex.
 
char getCigarCharOpFromRefOffset (int32_t refOffset)
 Return the character code of the cigar operator associated with the specified reference offset.
 
char getCigarCharOpFromRefPos (int32_t refPosition, int32_t queryStartPos)
 Return the character code of the cigar operator associated with the specified reference position.
 
uint32_t getNumOverlaps (int32_t start, int32_t end, int32_t queryStartPos)
 Return the number of bases that overlap the reference and the read associated with this cigar that fall within the specified region.
 
bool hasIndel ()
 Return whether or not the cigar has indels (insertions or deletions).
 

Friends

std::ostream & operator<< (std::ostream &stream, const CigarRoller &roller)
 Writes all of the cigar operations contained in this roller to the passed in stream.
 

Additional Inherited Members

- Public Types inherited from Cigar
enum  Operation {
  none =0 , match , mismatch , insert ,
  del , skip , softClip , hardClip ,
  pad
}
 Enum for the cigar operations.
 
- Static Public Member Functions inherited from Cigar
static bool foundInReference (Operation op)
 Return true if the specified operation is found in the reference sequence, false if not.
 
static bool foundInReference (char op)
 Return true if the specified operation is found in the reference sequence, false if not.
 
static bool foundInReference (const CigarOperator &op)
 Return true if the specified operation is found in the reference sequence, false if not.
 
static bool foundInQuery (Operation op)
 Return true if the specified operation is found in the query sequence, false if not.
 
static bool foundInQuery (char op)
 Return true if the specified operation is found in the query sequence, false if not.
 
static bool foundInQuery (const CigarOperator &op)
 Return true if the specified operation is found in the query sequence, false if not.
 
static bool isClip (Operation op)
 Return true if the specified operation is a clipping operation, false if not.
 
static bool isClip (char op)
 Return true if the specified operation is a clipping operation, false if not.
 
static bool isClip (const CigarOperator &op)
 Return true if the specified operation is a clipping operation, false if not.
 
static bool isMatchOrMismatch (Operation op)
 Return true if the specified operation is a match/mismatch operation, false if not.
 
static bool isMatchOrMismatch (const CigarOperator &op)
 Return true if the specified operation is a match/mismatch operation, false if not.
 
- Static Public Attributes inherited from Cigar
static const int MAX_OP_VALUE = pad
 
static const int32_t INDEX_NA = -1
 Value associated with an index that is not applicable/does not exist, used for converting between query and reference indexes/offsets when an associated index/offset does not exist.
 
- Protected Member Functions inherited from Cigar
void clearQueryAndReferenceIndexes ()
 
void setQueryAndReferenceIndexes ()
 
- Protected Attributes inherited from Cigar
std::vector< CigarOperator > cigarOperations
 

Detailed Description

The purpose of this class is to provide accessors for setting, updating, and modifying the CIGAR object. It is a child class of Cigar.

Docs from Sam1.pdf:

Clipped alignment. In Smith-Waterman alignment, a sequence may not be aligned from the first residue to the last one. Subsequences at the ends may be clipped off. We introduce operation 'S' to describe (softly) clipped alignment. Here is an example. Suppose the clipped alignment is:

REF:  AGCTAGCATCGTGTCGCCCGTCTAGCATACGCATGATCGACTGTCAGCTAGTCAGACTAGTCGATCGATGTG
READ:        gggGTGTAACC-GACTAGgggg

where on the read sequence, bases in uppercase are matches and bases in lowercase are clipped off. The CIGAR for this alignment is: 3S8M1D6M4S.
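The mapping from the gapped READ line to its CIGAR can be sketched as a run-length encoding. This is a minimal stand-alone illustration, not libStatGen code: cigarFromGappedRead is a hypothetical helper, and it assumes the notation of the example above (lowercase = soft clip, uppercase = match or mismatch, '-' = deletion from the reference).

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not part of libStatGen): run-length encode a gapped
// read into a CIGAR string. Lowercase bases are treated as soft-clipped (S),
// uppercase bases as aligned (M), and '-' as a deletion from the reference (D).
std::string cigarFromGappedRead(const std::string &gappedRead)
{
    std::string cigar;
    char runOp = '\0';   // operation of the run currently being counted
    int runLen = 0;      // length of that run

    for (char c : gappedRead)
    {
        char op;
        if (c == '-')
            op = 'D';
        else if (std::islower(static_cast<unsigned char>(c)))
            op = 'S';
        else
            op = 'M';

        if (op == runOp)
        {
            ++runLen;
        }
        else
        {
            // Flush the finished run before starting a new one.
            if (runLen > 0)
                cigar += std::to_string(runLen) + runOp;
            runOp = op;
            runLen = 1;
        }
    }
    if (runLen > 0)
        cigar += std::to_string(runLen) + runOp;
    return cigar;
}
```

Calling cigarFromGappedRead("gggGTGTAACC-GACTAGgggg") yields "3S8M1D6M4S", the CIGAR quoted in the example.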

If the mapping position of the query is not available, RNAME and CIGAR are set as “*”.

A CIGAR string consists of a series of operation lengths, each followed by its operation. The conventional CIGAR format allows for three types of operations: M for match or mismatch, I for insertion and D for deletion. The extended CIGAR format further allows four more operations, as is shown in the following table, to describe clipping, padding and splicing:

op  Description
M   Match or mismatch
I   Insertion to the reference
D   Deletion from the reference
N   Skipped region from the reference
S   Soft clip on the read (clipped sequence present in <seq>)
H   Hard clip on the read (clipped sequence NOT present in <seq>)
P   Padding (silent deletion from the padded reference sequence)

CigarRoller is an aid to correctly generating the CIGAR strings necessary to represent how a read maps to the reference.

It is called once a particular match candidate is being written out, so it is far less performance sensitive than the Smith Waterman code below.
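The operation table above determines how many read bases and how many reference bases a CIGAR accounts for. The following is a minimal stand-alone sketch of that bookkeeping, comparable in spirit to getExpectedQueryBaseCount() and getExpectedReferenceBaseCount() but not taken from the library. The names CigarLengths and cigarLengths are hypothetical, and the treatment of '=' and 'X' like 'M' is an assumption based on the SAM specification.

```cpp
#include <cctype>
#include <string>

// Hypothetical result type: bases consumed on the read and on the reference.
struct CigarLengths
{
    long queryBases;
    long referenceBases;
};

// Hypothetical helper (not part of libStatGen): total up the bases a CIGAR
// string consumes in the query and in the reference.
CigarLengths cigarLengths(const std::string &cigar)
{
    CigarLengths out = {0, 0};
    long count = 0;
    for (char c : cigar)
    {
        if (std::isdigit(static_cast<unsigned char>(c)))
        {
            count = count * 10 + (c - '0');
            continue;
        }
        switch (c)
        {
            case 'M': case '=': case 'X':   // present in both sequences
                out.queryBases += count;
                out.referenceBases += count;
                break;
            case 'I': case 'S':             // present in the query only
                out.queryBases += count;
                break;
            case 'D': case 'N':             // present in the reference only
                out.referenceBases += count;
                break;
            default:                        // H and P consume neither
                break;
        }
        count = 0;
    }
    return out;
}
```

For the CIGAR 3S8M1D6M4S this gives 21 query bases (3 + 8 + 6 + 4) and 15 reference bases (8 + 1 + 6).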

Definition at line 66 of file CigarRoller.h.

Constructor & Destructor Documentation

◆ CigarRoller() [1/2]

CigarRoller::CigarRoller ( )
inline

Default constructor initializes as a CIGAR with no operations.

Definition at line 79 of file CigarRoller.h.

{
    clearQueryAndReferenceIndexes();
}

◆ CigarRoller() [2/2]

CigarRoller::CigarRoller ( const char *  cigarString)
inline

Constructor that initializes the object with the specified cigarString.

Definition at line 85 of file CigarRoller.h.

{
    Set(cigarString);
}

References Set().

Member Function Documentation

◆ Add() [1/4]

void CigarRoller::Add ( char  operation,
int  count 
)

Append the specified operation with the specified count to this object.

Definition at line 84 of file CigarRoller.cpp.

{
    switch (operation)
    {
        case 0:
        case 'M':
            Add(match, count);
            break;
        case 1:
        case 'I':
            Add(insert, count);
            break;
        case 2:
        case 'D':
            Add(del, count);
            break;
        case 3:
        case 'N':
            Add(skip, count);
            break;
        case 4:
        case 'S':
            Add(softClip, count);
            break;
        case 5:
        case 'H':
            Add(hardClip, count);
            break;
        case 6:
        case 'P':
            Add(pad, count);
            break;
        case 7:
        case '=':
            Add(match, count);
            break;
        case 8:
        case 'X':
            Add(match, count);
            break;
        default:
            // Hmmm... what to do?
            std::cerr << "ERROR "
                      << "(" << __FILE__ << ":" << __LINE__ <<"): "
                      << "Parsing CIGAR - invalid character found "
                      << "with parameter " << operation << " and " << count
                      << std::endl;
            break;
    }
}

References Add(), Cigar::del, Cigar::hardClip, Cigar::insert, Cigar::match, Cigar::pad, Cigar::skip, and Cigar::softClip.

◆ Add() [2/4]

void CigarRoller::Add ( CigarRoller & rhs)
inline

Append the specified Cigar object to this object.

Definition at line 109 of file CigarRoller.h.

{
    (*this) += rhs;
}

◆ Add() [3/4]

void CigarRoller::Add ( const char *  cigarString)

Append the specified cigarString to this object.

Definition at line 136 of file CigarRoller.cpp.

{
    int operationCount = 0;
    while (*cigarString)
    {
        if (isdigit(*cigarString))
        {
            char *endPtr;
            operationCount = strtol((char *) cigarString, &endPtr, 10);
            cigarString = endPtr;
        }
        else
        {
            Add(*cigarString, operationCount);
            cigarString++;
        }
    }
}

References Add().
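The same digit/operator scan can be tried in isolation. The sketch below is a stand-alone illustration rather than library code: parseCigar is a hypothetical function that collects (count, operation) pairs into a vector instead of calling Add(), so it neither validates the operation characters nor merges adjacent runs.

```cpp
#include <cctype>
#include <cstdlib>
#include <utility>
#include <vector>

// Hypothetical helper (not part of libStatGen): split a CIGAR string into
// (count, operation character) pairs using the same strtol-based scan.
std::vector<std::pair<int, char> > parseCigar(const char *cigarString)
{
    std::vector<std::pair<int, char> > ops;
    int count = 0;
    while (*cigarString)
    {
        if (std::isdigit(static_cast<unsigned char>(*cigarString)))
        {
            // Read the whole run of digits and jump past it.
            char *end = nullptr;
            count = static_cast<int>(std::strtol(cigarString, &end, 10));
            cigarString = end;
        }
        else
        {
            // A non-digit is taken as the operation for the last count read.
            ops.push_back(std::make_pair(count, *cigarString));
            ++cigarString;
        }
    }
    return ops;
}
```

Parsing "3S8M1D6M4S" this way produces five pairs, starting with (3, 'S') and ending with (4, 'S'). As in the listing, the count is not reset after an operation, so an operation character with no preceding digits reuses the previous count.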

◆ Add() [4/4]

void CigarRoller::Add ( Operation  operation,
int  count 
)

Append the specified operation with the specified count to this object.

Definition at line 77 of file CigarRoller.cpp.

{
    CigarOperator rhs(operation, count);
    (*this) += rhs;
}

Referenced by Add(), GreedyTupleAligner< QueryType, ReferenceType, ReferenceIndex >::Align(), Set(), SamFilter::softClip(), CigarHelper::softClipBeginByRefPos(), and CigarHelper::softClipEndByRefPos().

◆ clear()

void CigarRoller::clear ( )

Clear this object so that it has no Cigar Operations.

Definition at line 325 of file CigarRoller.cpp.

{
    // Clearing the cigar, so the query & reference indexes are out of
    // date, so clear them.
    clearQueryAndReferenceIndexes();
    cigarOperations.clear();
}

Referenced by GreedyTupleAligner< QueryType, ReferenceType, ReferenceIndex >::Align(), operator=(), Set(), CigarHelper::softClipBeginByRefPos(), and CigarHelper::softClipEndByRefPos().

◆ getMatchPositionOffset()

int CigarRoller::getMatchPositionOffset ( )

DEPRECATED - do not use; there are better ways to accomplish this using read lengths, reference lengths, the span of the read, etc.

Definition at line 244 of file CigarRoller.cpp.

{
    int offset = 0;
    std::vector<CigarOperator>::iterator i;

    for (i = cigarOperations.begin(); i != cigarOperations.end(); i++)
    {
        switch (i->operation)
        {
            case insert:
                offset += i->count;
                break;
            case del:
                offset -= i->count;
                break;
                // TODO anything for case skip:????
            default:
                break;
        }
    }
    return offset;
}

References Cigar::del, and Cigar::insert.

◆ getString()

const char * CigarRoller::getString ( )

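Get the string representation of the Cigar operations in this object. As the definition below shows, the returned pointer refers to a function-static buffer that is reused by later calls, so the caller should not free it and should copy the string if it must outlive the next call.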

Definition at line 272 of file CigarRoller.cpp.

{
    // NB: the exact size of the string is not important, it just needs to be guaranteed
    // larger than the largest number of characters we could put into it.

    // we do not explicitly manage memory usage, and we expect when program exits, the memory used here will be freed
    static char *ret = NULL;
    static unsigned int retSize = 0;

    if (ret == NULL)
    {
        retSize = cigarOperations.size() * 12 + 1; // 12 == a magic number -> > 1 + log base 10 of MAXINT
        ret = (char*) malloc(sizeof(char) * retSize);
        assert(ret != NULL);

    }
    else
    {
        // currently, ret pointer has enough memory to use
        if (retSize > cigarOperations.size() * 12 + 1)
        {
        }
        else
        {
            retSize = cigarOperations.size() * 12 + 1;
            free(ret);
            ret = (char*) malloc(sizeof(char) * retSize);
        }
        assert(ret != NULL);
    }

    char *ptr = ret;
    char buf[12]; // > 1 + log base 10 of MAXINT

    std::vector<CigarOperator>::iterator i;

    // Progressively append the character representations of the operations to
    // the cigar string we allocated above.

    *ptr = '\0'; // clear result string
    for (i = cigarOperations.begin(); i != cigarOperations.end(); i++)
    {
        sprintf(buf, "%d%c", (*i).count, (*i).getChar());
        strcat(ptr, buf);
        while (*ptr)
        {
            ptr++; // limit the cost of strcat above
        }
    }
    return ret;
}
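Because the buffer in this listing is static and shared, code that needs an independently owned string may prefer a std::string result. The member list above shows an inherited getCigarString(std::string &) for that purpose. The sketch below is only a stand-alone illustration of the same "count then operation character" formatting. cigarToString and its pair-based input are hypothetical and not part of libStatGen.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of libStatGen): format (count, operation)
// pairs as a CIGAR string held in a caller-owned std::string.
std::string cigarToString(const std::vector<std::pair<int, char> > &ops)
{
    std::string out;
    for (const auto &op : ops)
    {
        out += std::to_string(op.first);   // operation length
        out += op.second;                  // operation character
    }
    return out;
}
```

With this approach no manual buffer sizing is needed, and two results can be held at the same time without one overwriting the other.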

◆ IncrementCount()

bool CigarRoller::IncrementCount ( int  index,
int  increment 
)

Increments the count for the operation at the specified index by the specified value; specify a negative value to decrement.

Returns
true if it is successfully incremented, false if not.

Definition at line 171 of file CigarRoller.cpp.

{
    if((index < 0) || ((unsigned int)index >= cigarOperations.size()))
    {
        // can't update, out of range, return false.
        return(false);
    }
    cigarOperations[index].count += increment;

    // Modifying the cigar, so the query & reference indexes are out of date,
    // so clear them.
    clearQueryAndReferenceIndexes();
    return(true);
}

Referenced by SamRecord::shiftIndelsLeft().

◆ operator+=() [1/2]

CigarRoller & CigarRoller::operator+= ( CigarRoller & rhs)

Add the contents of the specified CigarRoller to this object.

Definition at line 29 of file CigarRoller.cpp.

{
    std::vector<CigarOperator>::iterator i;
    for (i = rhs.cigarOperations.begin(); i != rhs.cigarOperations.end(); i++)
    {
        (*this) += *i;
    }
    return *this;
}

◆ operator+=() [2/2]

CigarRoller & CigarRoller::operator+= ( const CigarOperator & rhs)

Append the specified operator to this object.

Definition at line 43 of file CigarRoller.cpp.

{
    // Adding to the cigar, so the query & reference indexes would be
    // incomplete, so just clear them.
    clearQueryAndReferenceIndexes();

    if (rhs.count==0)
    {
        // nothing to do
    }
    else if (cigarOperations.empty() || cigarOperations.back() != rhs)
    {
        cigarOperations.push_back(rhs);
    }
    else
    {
        // last stored operation is the same as the new one, so just add it in
        cigarOperations.back().count += rhs.count;
    }
    return *this;
}
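The effect of this operator is that consecutive appends of the same operation collapse into a single entry, and zero-length operations are dropped. A minimal stand-alone sketch of that rule follows. Op and appendMerged are hypothetical names, and the sketch assumes the comparison is made on the operation code only, which the comment in the listing suggests.

```cpp
#include <vector>

// Hypothetical stand-in for a CIGAR operation: a code and a run length.
struct Op
{
    char code;
    int count;
};

// Hypothetical helper (not part of libStatGen): append an operation, merging
// it into the last entry when both have the same code.
void appendMerged(std::vector<Op> &ops, Op next)
{
    if (next.count == 0)
    {
        return;                              // nothing to add
    }
    if (!ops.empty() && ops.back().code == next.code)
    {
        ops.back().count += next.count;      // extend the existing run
    }
    else
    {
        ops.push_back(next);                 // start a new run
    }
}
```

Appending 3M, then 2M, then 0D, then 1I to an empty vector leaves two entries, 5M and 1I.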

◆ operator=()

CigarRoller & CigarRoller::operator= ( CigarRoller & rhs)

Set this object to be equal to the specified CigarRoller.

Definition at line 66 of file CigarRoller.cpp.

{
    clear();

    (*this) += rhs;

    return *this;
}

References clear().

◆ Remove()

bool CigarRoller::Remove ( int  index)

Remove the operation at the specified index.

Returns
true if successfully removed, false if not.

Definition at line 156 of file CigarRoller.cpp.

{
    if((index < 0) || ((unsigned int)index >= cigarOperations.size()))
    {
        // can't remove, out of range, return false.
        return(false);
    }
    cigarOperations.erase(cigarOperations.begin() + index);
    // Modifying the cigar, so the query & reference indexes are out of date,
    // so clear them.
    clearQueryAndReferenceIndexes();
    return(true);
}

Referenced by SamRecord::shiftIndelsLeft(), and CigarHelper::softClipEndByRefPos().

◆ Set() [1/2]

void CigarRoller::Set ( const char *  cigarString)

Sets this object to the specified cigarString.

Definition at line 204 of file CigarRoller.cpp.

{
    clear();
    Add(cigarString);
}

References Add(), and clear().

Referenced by CigarRoller(), CigarHelper::softClipBeginByRefPos(), and CigarHelper::softClipEndByRefPos().

◆ Set() [2/2]

void CigarRoller::Set ( const uint32_t *  cigarBuffer,
uint16_t  bufferLen 
)

Sets this object to the BAM formatted cigar found at the beginning of the specified buffer which is bufferLen long.

Definition at line 211 of file CigarRoller.cpp.

{
    clear();

    // Parse the buffer.
    for (int i = 0; i < bufferLen; i++)
    {
        int opLen = cigarBuffer[i] >> 4;

        Add(cigarBuffer[i] & 0xF, opLen);
    }
}

References Add(), and clear().
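Each 32-bit BAM CIGAR element packs the operation length in the upper 28 bits and the operation code in the low 4 bits, which is what the shift and mask in the listing unpack. The following stand-alone sketch decodes such a buffer to text. decodeBamCigar is a hypothetical helper, and the code-to-character table "MIDNSHP=X" follows the numeric cases in the Add(char, int) listing (0 through 8). Unlike that listing, this sketch keeps '=' and 'X' as distinct characters rather than folding them into a match.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not part of libStatGen): render a BAM-encoded CIGAR
// buffer as a CIGAR string.
std::string decodeBamCigar(const uint32_t *cigarBuffer, uint16_t bufferLen)
{
    static const char opChars[] = "MIDNSHP=X";   // codes 0..8
    std::string out;
    for (uint16_t i = 0; i < bufferLen; ++i)
    {
        uint32_t opLen = cigarBuffer[i] >> 4;     // upper 28 bits: length
        uint32_t opCode = cigarBuffer[i] & 0xF;   // low 4 bits: operation
        out += std::to_string(opLen);
        out += (opCode < 9) ? opChars[opCode] : '?';   // '?' marks an unknown code
    }
    return out;
}
```

For example, the element (3 << 4) | 4 decodes to 3S and (8 << 4) | 0 decodes to 8M.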

◆ Update()

bool CigarRoller::Update ( int  index,
Operation  op,
int  count 
)

Updates the operation at the specified index to be the specified operation and have the specified count.

Returns
true if it is successfully updated, false if not.

Definition at line 187 of file CigarRoller.cpp.

{
    if((index < 0) || ((unsigned int)index >= cigarOperations.size()))
    {
        // can't update, out of range, return false.
        return(false);
    }
    cigarOperations[index].operation = op;
    cigarOperations[index].count = count;

    // Modifying the cigar, so the query & reference indexes are out of date,
    // so clear them.
    clearQueryAndReferenceIndexes();
    return(true);
}

Referenced by SamRecord::shiftIndelsLeft().

Friends And Related Function Documentation

◆ operator<<

std::ostream & operator<< ( std::ostream &  stream,
const CigarRoller & roller
)
friend

Writes all of the cigar operations contained in this roller to the passed in stream.

Definition at line 167 of file CigarRoller.h.

{
    stream << roller.cigarOperations;
    return stream;
}

The documentation for this class was generated from the following files: